Title of the Invention



NUCLEOTIDE AND AMINO ACID SEQUENCES OF THE ENVELOPE 1 GENE OF 51 HEPATITIS
C VIRUS ISOLATES AND THE USE OF REAGENTS DERIVED THEREFROM AS DIAGNOSTIC 
REAGENTS AND VACCINES 



Field Of Invention 
The present invention is in the field of 
hepatitis virology. The invention relates to the complete 
nucleotide and deduced amino acid sequences of the envelope 
1 (E1) gene of 51 hepatitis C virus (HCV) isolates from
around the world and the grouping of these isolates into
twelve distinct HCV genotypes. More specifically, this
invention relates to oligonucleotides, peptides and
recombinant proteins derived from the envelope 1 gene
sequences of the 51 isolates of hepatitis C virus and to
diagnostic methods and vaccines which employ these
reagents.



Background Of Invention 
Hepatitis C, originally called non-A, non-B
hepatitis, was first described in 1975 as a disease
serologically distinct from hepatitis A and hepatitis B
(Feinstone, S.M. et al. (1975) N. Engl. J. Med. 292:767-770).
Although hepatitis C was (and is) the leading type
of transfusion-associated hepatitis as well as an important
part of community-acquired hepatitis, little progress was
made in understanding the disease until the recent
identification of hepatitis C virus (HCV) as the causative
agent of hepatitis C via the cloning and sequencing of the
HCV genome (Choo, Q.-L. et al. (1989) Science 244:359-362).
The sequence information generated by this study resulted
in the characterization of HCV as a small, enveloped,
positive-stranded RNA virus and led to the demonstration
that HCV is a major cause of both acute and chronic
hepatitis worldwide (Weiner, A.J. et al. (1990) Lancet
335:1-3). These observations, combined with studies
showing that over 50% of acute cases of hepatitis C
progress to chronicity with 20% of these resulting in
cirrhosis and an undetermined proportion progressing to
liver cancer, have led to tremendous efforts by
investigators within the hepatitis C field to develop
diagnostic assays and vaccines which can detect and prevent
hepatitis C infection.

The cloning and sequencing of the HCV genome by
Choo et al. (1989) has permitted the development of
serologic tests which can detect HCV or antibody to HCV
(Kuo, G. et al. (1989) Science 244:362-364). In addition,
the work of Choo et al. has also allowed the development of
methods for detecting HCV infection via amplification of
HCV RNA sequences by reverse transcription and cDNA
polymerase chain reaction (RT-PCR) using primers derived
from the HCV genomic sequence (Weiner, A.J. et al.).
However, although the development of these diagnostic
methods has resulted in improved diagnosis of HCV
infection, only approximately 60% of cases of hepatitis C
are associated with a factor identified as contributing to
transmission of HCV (Alter, M.J. et al. (1989) JAMA
262:1201-1205). This observation suggests that effective
control of hepatitis C transmission is likely to occur only
via universal pediatric vaccination as has been initiated
recently for hepatitis B virus. Unfortunately, attempts to
date to protect chimpanzees from hepatitis C infection via
administration of recombinant vaccines have had only
limited success. Moreover, the apparent genetic
heterogeneity of HCV, as indicated by the recent assignment
of all available HCV isolates to one of four genotypes,
I-IV (Okamoto, H. et al. (1992) J. Gen. Virol. 73:673-679),
presents additional hurdles which must be overcome in order
to develop accurate and effective diagnostic assays and
vaccines.

For example, one possible obstacle to the
development of effective hepatitis C vaccines would arise
if the observed genetic heterogeneity of HCV reflects
serologic heterogeneity. In such a case, the most
genetically diverse strains of HCV may then represent
different serotypes of HCV with the result being that
infection with one strain may not protect against infection
with another. Indeed, the inability of one strain to
protect against infection with another strain was recently
noted by both Farci et al. (Farci, P. et al. (1992) Science
258:135-140) and Prince et al. (Prince, A.M. et al. (1992)
J. Infect. Dis. 165:438-443), each of whom presented
evidence that while infection with one strain of HCV does
modify the degree of the hepatitis C associated with the
reinfection, it does not protect against reinfection with a
closely related strain. The genetic heterogeneity among
different HCV strains also increases the difficulty
encountered in developing RT-PCR assays to detect HCV
infection since such heterogeneity often results in
false-negative results because of primer and template mismatch.
In addition, currently used serologic tests for detection
of HCV or for detection of antibody to HCV are not
sufficiently well developed to detect all of the HCV
genotypes which might exist in a given blood sample.
Finally, in terms of choosing the proper treatment modality
to combat hepatitis infection, the inability of presently
available serologic assays to distinguish among the various
genotypes of HCV represents a significant shortcoming in
that recent reports suggest that an HCV-infected patient's
response to therapy might be related to the genotype of the
infectious virus (Yoshioka, K. et al. (1992) Hepatology
16:293-299; Kanai, K. et al. (1992) Lancet 339:1543; Lau,
J.Y.N. et al. (1992) Hepatology 16:209A). Indeed, the data
presented in the above studies suggest that the closely
related genotypes I and II are less responsive to
interferon therapy than are the closely related genotypes
III and IV. Moreover, preliminary data by Pozzato et al.
(Pozzato, G. et al. (1991) Lancet 338:509) suggest that
different genotypes may be associated with different types
or degrees of clinical disease. Taken together, these
studies suggest that before effective vaccines against HCV
infection can be developed, and indeed, before more
accurate and effective methods for diagnosis and treatment
of HCV infection can be produced, one must obtain a greater
knowledge about the genetic and serologic diversity of HCV
isolates.

In a recent attempt to gain an understanding of
the extent of genetic heterogeneity among HCV strains, Bukh
et al. carried out a detailed analysis of HCV isolates via
the use of PCR technology to amplify different regions of
the HCV genome (Bukh, J. et al. (1992a) Proc. Natl. Acad.
Sci. U.S.A. 89:187-191). Following PCR amplification, the
5'-noncoding (5' NC) portion of the genomes of various HCV
isolates was sequenced and it was found that primer pairs
designed from conserved regions of the 5' NC region of the
HCV genome were more sensitive for detecting the presence
of HCV than were primer pairs representing other portions
of the genome (Bukh, J. et al. (1992b) Proc. Natl. Acad.
Sci. U.S.A. 89:4942-4946). In addition, the authors noted
that although many of the HCV isolates examined could be
classified into the four genotypes described by Okamoto et
al. (1992), other previously undescribed genotypes emerged
based on genetic heterogeneity observed in the 5' NC region
of the various isolates. One of the most prominent of
these newly noted genotypes comprised a group of related
viruses that contained the most genetically divergent 5' NC
regions of those studied. This group of viruses,
tentatively classified as a fifth genotype, is very
similar to strains recently described by others (Cha, T.-A.
et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89:7144-7148;
Chan, S-W. et al. (1992) J. Gen. Virol. 73:1131-1141 and
Lee, C-H. et al. (1992) J. Clin. Microbiol. 30:1602-1604).
In addition, at least four more putative genotypes were
identified, thereby providing evidence that the genetic
heterogeneity of HCV was more extensive than previously
appreciated.

However, while the studies of Bukh et al. (1992a
and b) provided new and useful information on the genetic
heterogeneity of HCV, it is widely appreciated by those
skilled in the art that the three structural genes of HCV,
core (C), envelope 1 (E1) and envelope 2/nonstructural 1
(E2/NS1), are the most important for the development of
serologic diagnostics and vaccines since it is the products
of these genes that constitute the hepatitis C virion.
Thus, a determination of the nucleotide sequence of one or
all of the structural genes of a variety of HCV isolates
would be useful in designing reagents for use in diagnostic
assays and vaccines, since a demonstration of genetic
heterogeneity in a structural gene(s) of HCV isolates might
suggest that some of the HCV genotypes represent distinct
serotypes of HCV, based upon the previously observed
relationship between genetic heterogeneity and serologic
heterogeneity among another group of single-stranded,
positive-sense RNA viruses, the picornaviruses (Rueckert,
R.R. "Picornaviridae and their replication", in Fields,
B.N. et al., eds. Virology, New York: Raven Press, Ltd.
(1990) 507-548).



Summary of Invention

The present invention relates to 51 cDNAs, each
encoding the complete nucleotide sequence of the envelope 1
(E1) gene of an isolate of human hepatitis C virus (HCV).

The present invention also relates to the nucleic
acid and deduced amino acid sequences of these E1 cDNAs.

It is an object of this invention to provide
synthetic nucleic acid sequences capable of directing
production of recombinant E1 proteins, as well as
equivalent natural nucleic acid sequences. Such natural
nucleic acid sequences may be isolated from a cDNA or
genomic library from which the gene capable of directing
synthesis of the E1 proteins may be identified and
isolated. For purposes of this application, nucleic acid
sequence refers to RNA, DNA, cDNA or any synthetic variant
thereof which encodes peptides.

The invention also relates to the method of
preparing recombinant E1 proteins derived from the E1 cDNA
sequences by cloning the nucleic acid and inserting the
cDNA into an expression vector and expressing the
recombinant protein in a host cell.

The invention also relates to isolated and
substantially purified recombinant E1 proteins and analogs
thereof encoded by the E1 cDNAs.

The invention further relates to the use of
recombinant E1 proteins as diagnostic agents and as
vaccines.

The invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived from
the E1 cDNAs to inhibit the expression of the hepatitis C
E1 gene.

The invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences of the 51 E1 cDNAs. These multiple
sequence alignments serve to highlight regions of homology
and non-homology between different sequences and hence can
be used by one skilled in the art to design peptides and
oligonucleotides useful as reagents in diagnostic assays
and vaccines.

The invention therefore also relates to purified
and isolated peptides and analogs thereof derived from E1
cDNA sequences.

The invention further relates to the use of these
peptides as diagnostic agents and vaccines.

The present invention also encompasses methods of
detecting antibodies specific for hepatitis C virus in
biological samples. The methods of detecting HCV or



WO 95/01442 PCT/US94/07320 



antibodies to HCV disclosed in the present invention are
useful for diagnosis of infection and disease caused by HCV
and for monitoring the progression of such disease. Such
methods are also useful for monitoring the efficacy of
therapeutic agents during the course of treatment of HCV
infection and disease in a mammal.

The invention also provides a kit for the
detection of antibodies specific for HCV in a biological
sample where said kit contains at least one purified and
isolated peptide derived from the E1 cDNA sequences.

The invention further provides isolated and
purified genotype-specific oligonucleotides and analogs
thereof derived from E1 cDNA sequences.

The invention also relates to a method for
detecting the presence of hepatitis C virus in a mammal,
said method comprising analyzing the RNA of a mammal for
the presence of hepatitis C virus. The invention further
relates to a method for determining the genotype of
hepatitis C virus present in a mammal. This method is
useful in determining the proper course of treatment for an
HCV-infected patient.

The invention also provides a diagnostic kit for
the detection of hepatitis C virus in a biological sample.
The kit comprises purified and isolated nucleic acid
sequences useful as primers for reverse-transcription
polymerase chain reaction (RT-PCR) analysis of RNA for the
presence of hepatitis C virus.

The invention further provides a diagnostic kit
for the determination of the genotype of a hepatitis C
virus present in a mammal. The kit comprises purified and
isolated nucleic acid sequences useful as primers for
RT-PCR analysis of RNA for the presence of HCV in a biological
sample and purified and isolated nucleic acid sequences
useful as hybridization probes in determining the genotype
of the HCV isolate detected by PCR.

This invention also relates to pharmaceutical
compositions for use in prevention or treatment of
hepatitis C in a mammal.



Description of Figures 
Figures 1A-H show computer generated sequence
alignments of the nucleotide sequences of the 51 HCV E1
cDNAs. The single letter abbreviations used for the
nucleotides shown in Figures 1A-H are those standardly used
in the art. Figure 1A shows the alignment of SEQ ID NOs:1-8
to produce a consensus sequence for genotype I/1a.
Figure 1B shows the alignment of SEQ ID NOs:9-25 to produce
a consensus sequence for genotype II/1b. Figure 1C shows
the alignment of SEQ ID NOs:26-29 to produce a consensus
sequence for genotype III/2a. Figure 1D shows the
alignment of SEQ ID NOs:30-33 to produce a consensus
sequence for genotype IV/2b. Figure 1E shows the alignment
of SEQ ID NOs:35-39 to produce a consensus sequence for
genotype V/3a. Figure 1F shows the computer alignment of
SEQ ID NOs:42-43 to produce a consensus sequence for
genotype 4c. Figure 1G shows the alignment of SEQ ID
NOs:45-50 to produce a consensus sequence for genotype 5a.
The nucleotides shown in capital letters in the consensus
sequences of Figures 1A-G are those conserved within a
genotype while nucleotides shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 1A-E and 1G, when a
lower case letter is shown in a consensus sequence, the
lower case letter represents the nucleotide found most
frequently in the sequences aligned to produce the
consensus sequence. In Figure 1F, the lower case letters
shown in the consensus sequence are nucleotides in SEQ ID
NO:42 which differ from nucleotides found in the same
positions in SEQ ID NO:43. Finally, a hyphen at a
nucleotide position in the consensus sequences in Figures
1A-G indicates that two nucleotides were found in equal
numbers at that position in the aligned sequences. In the
aligned sequences, nucleotides are shown in lower case
letters if they differed from the nucleotides of both
adjacent isolates. Figure 1H shows the alignment of the
consensus sequences of Figures 1A-G with SEQ ID NO:34
(genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ ID NO:41
(genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ ID NO:51
(genotype 6a) to produce a consensus sequence for all
twelve genotypes. This consensus sequence is shown as the
bottom line of Figure 1H where the nucleotides shown in
capital letters are conserved among all genotypes and a
blank space indicates that the nucleotide at that position
is not conserved among all genotypes.
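
The notation rules stated for the within-genotype consensus sequences (capital letter for a conserved position, lower case letter for the most frequent residue at a variable position, hyphen where two residues occur in equal numbers) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names and the toy sequences are hypothetical, the hyphen is applied whenever the two most frequent residues tie, and the code is not the program used to generate the figures.

```python
from collections import Counter

def consensus_column(residues):
    """Return the consensus symbol for one alignment column."""
    counts = Counter(r.upper() for r in residues).most_common()
    if len(counts) == 1:
        return counts[0][0]              # conserved: capital letter
    (top, n_top), (_, n_second) = counts[0], counts[1]
    if n_top == n_second:
        return "-"                       # two residues found in equal numbers
    return top.lower()                   # variable: most frequent, lower case

def consensus(aligned_sequences):
    """Build a consensus string from equal-length aligned sequences."""
    return "".join(consensus_column(col) for col in zip(*aligned_sequences))

# Toy alignment (hypothetical, not taken from the sequence listing).
toy = ["ACGT", "ACGA", "ACTA", "ATTC"]
print(consensus(toy))  # prints "Ac-a"
```

In the toy alignment, the first column is conserved, the second is variable with C most frequent, the third has G and T in equal numbers, and the fourth is variable with A most frequent.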

Figures 2A-H show computer alignments of the
deduced amino acid sequences of the 51 HCV E1 cDNAs. The
single letter abbreviations used for the amino acids shown
in Figures 2A-H follow the conventional amino acid
shorthand for the twenty naturally occurring amino acids.
Figure 2A shows the alignment of SEQ ID NOs:52-59 to
produce a consensus sequence for genotype I/1a. Figure 2B
shows the alignment of SEQ ID NOs:60-76 to produce a
consensus sequence for genotype II/1b. Figure 2C shows the
alignment of SEQ ID NOs:77-80 to produce a consensus
sequence for genotype III/2a. Figure 2D shows the
alignment of SEQ ID NOs:81-84 to produce a consensus
sequence for genotype IV/2b. Figure 2E shows the alignment
of SEQ ID NOs:86-90 to produce a consensus sequence for
genotype V/3a. Figure 2F shows the computer alignment of
SEQ ID NOs:93-94 to produce a consensus sequence for
genotype 4c. Figure 2G shows the alignment of SEQ ID
NOs:96-101 to produce a consensus sequence for genotype 5a.
The amino acids shown in capital letters in the consensus
sequences of Figures 2A-G are those conserved within a
genotype while amino acids shown in lower case letters in
the consensus sequences are those variable within a
genotype. In addition, in Figures 2A-E and 2G, when a
lower case letter is shown in a consensus sequence, the
letter represents the amino acid found most frequently in
the sequences aligned to produce the consensus sequence.
In Figure 2F, the lower case letters shown in the consensus
sequence are amino acids in SEQ ID NO:93 which differ from
amino acids found in the same positions in SEQ ID NO:94.
Finally, a hyphen at an amino acid position in the
consensus sequences of Figures 2A-G indicates that two
amino acids were found in equal numbers at that position in
the aligned sequences. In the aligned sequences, amino
acids are shown in lower case letters if they differed from
the amino acids of both adjacent isolates. Figure 2H shows
the alignment of the consensus sequences of Figures 2A-G
with SEQ ID NO:85 (genotype 2c), SEQ ID NO:91 (genotype
4a), SEQ ID NO:92 (genotype 4b), SEQ ID NO:95 (genotype 4d)
and SEQ ID NO:102 (genotype 6a) to produce a consensus
sequence for all twelve genotypes. This consensus sequence
is shown as the bottom line of Figure 2H where the amino
acids shown in capital letters are conserved among all
genotypes and a blank space indicates that the amino acid
at that position is not conserved among all genotypes.

Figure 3 shows a multiple sequence alignment of the
deduced amino acid sequence of the E1 gene of 51 HCV
isolates collected worldwide. The consensus sequence of
the E1 protein is shown in boldface (top). In the
consensus sequence, cysteine residues are highlighted with
stars, potential N-linked glycosylation sites are
underlined, and invariant amino acids are capitalized,
whereas variable amino acids are shown in lower case
letters. In the alignment, amino acids are shown in lower
case letters if they differed from the amino acid of both
adjacent isolates. Amino acid residues shown in bold print
in the alignment represent residues which at that position
in the amino acid sequence are genotype-specific. Amino
acids that were invariant among all HCV isolates are shown
as hyphens (-) in the alignment. Amino acid positions
correspond to those of the HCV prototype sequence (HCV-1,
Choo, Q.-L. et al. (1991) Proc. Natl. Acad. Sci. U.S.A.
88:2451-2455) with the first amino acid of the E1 protein at
position 192. The grouping of isolates into 12 genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a
and 6a) is indicated.

Figure 4 shows a dendrogram of the genetic
relatedness of the twelve genotypes of HCV based on the
percent amino acid identity of the E1 gene of the HCV
genome. The twelve genotypes shown are designated as I/1a,
II/1b, III/2a, IV/2b, V/3a, 2c, 4a, 4b, 4c, 4d, 5a and 6a.
The shaded bars represent a range showing the maximum and
minimum homology between the amino acid sequence of any one
isolate of the genotype indicated and the amino acid
sequence of any other isolate.

Figure 5 shows the distribution of 74 HCV isolates,
classified by E1 gene sequence, among the twelve HCV
genotypes in the 12 countries studied. For 51 of these HCV
isolates, including 8 isolates of genotype I/1a, 17
isolates of genotype II/1b and 26 isolates comprising the
additional 10 genotypes, the complete E1 gene sequence was
determined. In the remaining 23 isolates, all of genotypes
I/1a and II/1b, the genotype assignment was based on only a
partial E1 gene sequence. The partially sequenced isolates
did not represent additional genotypes in any of the 12
countries. The number of isolates of a particular genotype
is given in each of the 12 countries studied. For ease of
viewing, those genotypes designated by two terms (e.g.,
I/1a) are indicated by the latter term (e.g., 1a). The
designations used for each country are: Denmark (DK);
Dominican Republic (DR); Germany (D); Hong Kong (HK); India
(IND); Sardinia, Italy (S); Peru (P); South Africa (SA);
Sweden (SW); Taiwan (T); United States (US); and Zaire (Z).
National borders depicted in this figure represent those
existing at the time of sampling.
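
As a cross-check, the SEQ ID NO ranges given in the description of Figures 1A-H can be tabulated against the isolate counts stated for Figure 5. The Python sketch below is illustrative only: the dictionary name and layout are assumptions made here, while the ranges and the figure of 23 partially sequenced isolates are taken from the figure descriptions.

```python
# Nucleotide SEQ ID NO ranges per genotype, transcribed from the
# description of Figures 1A-H. The dictionary itself is an illustrative
# assumption, not part of the sequence listing.
GENOTYPE_SEQ_IDS = {
    "I/1a": range(1, 9),
    "II/1b": range(9, 26),
    "III/2a": range(26, 30),
    "IV/2b": range(30, 34),
    "2c": range(34, 35),
    "V/3a": range(35, 40),
    "4a": range(40, 41),
    "4b": range(41, 42),
    "4c": range(42, 44),
    "4d": range(44, 45),
    "5a": range(45, 51),
    "6a": range(51, 52),
}

counts = {genotype: len(ids) for genotype, ids in GENOTYPE_SEQ_IDS.items()}
fully_sequenced = sum(counts.values())
other_genotypes = fully_sequenced - counts["I/1a"] - counts["II/1b"]
partially_sequenced = 23  # stated for Figure 5; all of genotypes I/1a and II/1b

print(counts["I/1a"], counts["II/1b"], other_genotypes)      # 8 17 26
print(fully_sequenced, fully_sequenced + partially_sequenced)  # 51 74
```

The tabulated ranges reproduce the stated breakdown of 8, 17 and 26 completely sequenced isolates, and the totals of 51 and 74 isolates.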



Detailed Description Of Invention

The present invention relates to 51 cDNAs, each
encoding the complete nucleotide sequence of the envelope 1
(E1) gene of an isolate of human hepatitis C virus (HCV).
The cDNAs of the present invention were obtained as
follows. Viral RNA was extracted from serum collected from
humans infected with hepatitis C virus and the viral RNA
was then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
the HCV strain H-77 (Ogata, N. et al. (1991) Proc. Natl.
Acad. Sci. U.S.A. 88:3392-3396). The amplified cDNA was
then isolated by gel electrophoresis and sequenced.

The present invention further relates to the
nucleotide sequences of the cDNAs encoding the E1 gene of
the 51 HCV isolates. These nucleotide sequences are shown
in the sequence listing as SEQ ID NO:1 through SEQ ID
NO:51.

The abbreviations used for the nucleotides are 
those standardly used in the art. 

The deduced amino acid sequences of SEQ ID
NO:1 through SEQ ID NO:51 are presented in the sequence
listing as SEQ ID NO:52 through SEQ ID NO:102, where the
amino acid sequence in SEQ ID NO:52 is deduced from the
nucleotide sequence shown in SEQ ID NO:1, the amino acid
sequence shown in SEQ ID NO:53 is deduced from the
nucleotide sequence shown in SEQ ID NO:2 and so on. The
deduced amino acid sequence of each of SEQ ID NOs:52-102
starts at nucleotide 1 of the corresponding sequence shown
in SEQ ID NOs:1-51 and extends 595 nucleotides.
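
The pairing between nucleotide and amino acid entries in the sequence listing is a fixed offset of 51, which the following Python sketch makes explicit. The function name is hypothetical and is introduced here only for illustration.

```python
N_ISOLATES = 51  # number of E1 cDNAs in the sequence listing

def protein_seq_id(nucleotide_seq_id):
    """Map a nucleotide SEQ ID NO (1-51) to its deduced amino acid SEQ ID NO."""
    if not 1 <= nucleotide_seq_id <= N_ISOLATES:
        raise ValueError("nucleotide SEQ ID NO must be between 1 and 51")
    return nucleotide_seq_id + N_ISOLATES

print(protein_seq_id(1), protein_seq_id(2), protein_seq_id(51))  # 52 53 102
```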

The three letter abbreviations used in SEQ ID
NOs:52-102 follow the conventional amino acid shorthand for
the twenty naturally occurring amino acids.

Preferably, the E1 proteins or peptides of the
present invention are substantially homologous to, and most
preferably biologically equivalent to, the native HCV E1
proteins or peptides. By "biologically equivalent" as used
throughout the specification and claims, it is meant that
the compositions are immunogenically equivalent to the
native E1 proteins or peptides. The E1 proteins or
peptides of the present invention may also stimulate the
production of protective antibodies upon injection into a
mammal that would serve to protect the mammal upon
challenge with HCV. By "substantially homologous" as used
throughout the ensuing specification and claims to describe
E1 proteins and peptides, it is meant a degree of homology
in the amino acid sequence to the native E1 proteins or
peptides. Preferably the degree of homology is in excess
of 90%, preferably in excess of 95%, with a particularly
preferred group of proteins being in excess of 99%
homologous with the native E1 proteins or peptides.
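
One simple way to put a number on the degree of homology is percent identity over an alignment. The Python sketch below assumes that "homology" is measured as the percentage of identical positions between two equal-length, already aligned sequences, and that the thresholds 90, 95 and 99 are percentages; the specification does not prescribe a particular calculation. The function names and the example peptides are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of aligned positions at which two sequences agree."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be aligned to the same non-zero length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)

def preference_tier(pct):
    """Classify a percent identity against the thresholds named in the text."""
    if pct > 99:
        return "in excess of 99% (particularly preferred)"
    if pct > 95:
        return "in excess of 95%"
    if pct > 90:
        return "in excess of 90%"
    return "not in excess of 90%"

# Arbitrary 20-residue example peptides, not taken from the sequence listing.
native = "ACDEFGHIKLMNPQRSTVWY"
variant = "ACDEFGHIKLMNPQRSTVWF"  # one substitution at the last position

pct = percent_identity(native, variant)
print(pct, preference_tier(pct))  # 95.0 in excess of 90%
```

Because each threshold is "in excess of", a peptide that is exactly 95% identical falls in the 90% tier rather than the 95% tier.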

Variations are contemplated in the cDNA sequences
shown in SEQ ID NO:1 through SEQ ID NO:51 which will result
in a DNA sequence that is capable of directing production
of analogs of the corresponding envelope 1 (E1) protein
shown in SEQ ID NO:52 through SEQ ID NO:102. It should be
noted that the DNA sequences set forth above represent a
preferred embodiment of the present invention. Due to the
degeneracy of the genetic code, it is to be understood that
numerous choices of nucleotides may be made that will lead
to a DNA sequence capable of directing production of the
instant E1 protein or its analogs. As such, DNA sequences
which are functionally equivalent to the sequence set forth
above or which are functionally equivalent to sequences
that would direct production of analogs of the E1 proteins
produced pursuant to the amino acid sequences set forth
above, are intended to be encompassed within the present
invention.

The term "analog" as used throughout the
specification or claims to describe the E1 proteins or
peptides of the present invention, includes any protein or
peptide having an amino acid residue sequence substantially
identical to a sequence specifically shown herein in which
one or more residues have been conservatively substituted
with a biologically equivalent residue. Examples of
conservative substitutions include the substitution of one
non-polar (hydrophobic) residue such as isoleucine, valine,
leucine or methionine for another, the substitution of one
polar (hydrophilic) residue for another such as between
arginine and lysine, between glutamine and asparagine,
between glycine and serine, the substitution of one basic
residue such as lysine, arginine or histidine for another,
or the substitution of one acidic residue, such as aspartic
acid or glutamic acid for another.
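
The example substitution classes listed above can be encoded as a small lookup. The Python sketch below covers only the examples named in the text, which are given as non-limiting; the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and residues are written in single letter code.

```python
# Residue groups transcribed from the examples in the text (single letter code).
CONSERVATIVE_GROUPS = [
    {"I", "V", "L", "M"},   # non-polar (hydrophobic)
    {"R", "K"},             # polar (hydrophilic) pair
    {"Q", "N"},             # polar (hydrophilic) pair
    {"G", "S"},             # polar (hydrophilic) pair
    {"K", "R", "H"},        # basic
    {"D", "E"},             # acidic
]

def is_conservative(original, replacement):
    """True if the two residues differ and share one of the example groups."""
    a, b = original.upper(), replacement.upper()
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)

print(is_conservative("I", "V"))  # True: both non-polar
print(is_conservative("H", "R"))  # True: both basic
print(is_conservative("D", "K"))  # False: acidic versus basic
```

A substitution that this lookup reports as False is not thereby excluded from the definition of an analog; it only falls outside the examples given.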

The phrase "conservative substitution" also
includes the use of a chemically derivatized residue in
place of a non-derivatized residue provided that the
resulting protein or peptide is biologically equivalent to
the native E1 protein or peptide.

"Chemical derivative" refers to an E1 protein or
peptide having one or more residues chemically derivatized
by reaction of a functional side group. Examples of such
derivatized molecules include, but are not limited to,
those molecules in which free amino groups have been
derivatized to form amine hydrochlorides, p-toluene
sulfonyl groups, carbobenzoxy groups, t-butyloxycarbonyl
groups, chloroacetyl groups or formyl groups. Free carboxyl
groups may be derivatized to form salts, methyl and ethyl
esters or other types of esters or hydrazides. Free
hydroxyl groups may be derivatized to form O-acyl or
O-alkyl derivatives. The imidazole nitrogen of histidine may
be derivatized to form N-im-benzylhistidine. Also included
as chemical derivatives are those proteins or peptides
which contain one or more naturally-occurring amino acid
derivatives of the twenty standard amino acids. For
example: 4-hydroxyproline may be substituted for proline;
5-hydroxylysine may be substituted for lysine;
3-methylhistidine may be substituted for histidine;
homoserine may be substituted for serine; and ornithine may
be substituted for lysine. The E1 protein or peptide of
the present invention also includes any protein or peptide
having one or more additions and/or deletions of residues
relative to the sequence of a peptide whose sequence is
shown herein, so long as the peptide is biologically
equivalent to the native E1 protein or peptide.

The present invention also includes a recombinant
DNA method for the manufacture of HCV E1 proteins. In this
method, natural or synthetic nucleic acid sequences may be
used to direct the production of E1 proteins.

In one embodiment of the invention, the method
comprises:

(a) preparation of a nucleic acid sequence
capable of directing a host organism to produce HCV E1
protein;

(b) cloning the nucleic acid sequence into a
vector capable of being transferred into and replicated in
a host organism, such vector containing operational
elements for the nucleic acid sequence;

(c) transferring the vector containing the
nucleic acid and operational elements into a host organism
capable of expressing the protein;

(d) culturing the host organism under conditions
appropriate for amplification of the vector and expression
of the protein; and

(e) harvesting the protein.

In another embodiment of the invention, the
method for the recombinant DNA synthesis of an HCV E1
protein encoded by any one of the nucleic acid sequences
shown in SEQ ID NOs:1-51 comprises:

(a) culturing a transformed or transfected host
organism containing a nucleic acid sequence capable of
directing the host organism to produce a protein, under
conditions such that the protein is produced, said protein
exhibiting substantial homology to a native E1 protein
isolated from HCV having the amino acid sequence according
to any one of the amino acid sequences shown in SEQ ID
NOs:52-102 or combinations thereof.

In one embodiment, the RNA sequence of an HCV
isolate is isolated and cloned to cDNA as follows. Viral
RNA is extracted from a biological sample collected from
human subjects infected with hepatitis C and the viral RNA
is then reverse transcribed and amplified by polymerase
chain reaction using primers deduced from the sequence of
HCV strain H-77 (Ogata et al. (1991)). Preferred primer
sequences are shown as SEQ ID NOs:103-108 in the sequence
listing. Once amplified, the PCR fragments are isolated by
gel electrophoresis and sequenced.

The vectors contemplated for use in the present
invention include any vectors into which a nucleic acid
sequence as described above can be inserted, along with any
preferred or required operational elements, and which
vector can then be subsequently transferred into a host
organism and replicated in such organisms. Preferred
vectors are those whose restriction sites have been well
documented and which contain the operational elements
preferred or required for transcription of the nucleic acid
sequence.

The "operational elements" as discussed herein
include at least one promoter, at least one operator, at
least one leader sequence, at least one terminator codon,
and any other DNA sequences necessary or preferred for
appropriate transcription and subsequent translation of the
vector nucleic acid. In particular, it is contemplated
that such vectors will contain at least one origin of
replication recognized by the host organism along with at
least one selectable marker and at least one promoter
sequence capable of initiating transcription of the nucleic
acid sequence.

In construction of the recombinant for expression
cloning vector of the present invention, it should
additionally be noted that multiple copies of the nucleic
acid sequence and its attendant operational elements may be
inserted into each vector. In such an embodiment, the host
organism would produce greater amounts per vector of the
desired E1 protein. The number of multiple copies of the
DNA sequence which may be inserted into the vector is
limited only by the ability of the resultant vector, due to
its size, to be transferred into and replicated and
transcribed in an appropriate host microorganism.

In another embodiment, restriction digest
fragments containing a coding sequence for E1 proteins can
be inserted into a suitable expression vector that
functions in prokaryotic or eukaryotic cells. By suitable
is meant that the vector is capable of carrying and
expressing a complete nucleic acid sequence coding for E1
protein. Preferred expression vectors are those that
function in a eukaryotic cell. Examples of such vectors
include but are not limited to vaccinia virus vectors,
adenovirus or herpes viruses. A preferred vector is the
baculovirus transfer vector, pBlueBac.

In yet another embodiment, the selected
recombinant expression vector may then be transfected into
a suitable eukaryotic cell system for purposes of
expressing the recombinant protein. Such eukaryotic cell
systems include but are not limited to cell lines such as
HeLa, MRC-5 or CV-1. A preferred eukaryotic cell system is
SF9 insect cells.

The expressed recombinant protein may be detected 
by methods known in the art including, but not limited to, 
Coomassie blue staining and Western blotting. 

The present invention also relates to
substantially purified and isolated recombinant E1
proteins. In one embodiment, the recombinant protein
expressed by the SF9 cells can be obtained as a crude
lysate or it can be purified by standard protein
purification procedures known in the art which may include
differential precipitation, molecular sieve chromatography,
ion-exchange chromatography, isoelectric focusing, gel
electrophoresis and affinity and immunoaffinity
chromatography. The recombinant protein may be purified by
passage through a column containing a resin which has bound
thereto antibodies specific for the open reading frame
(ORF) protein.

The present invention further relates to the use
of recombinant E1 proteins as diagnostic agents and
vaccines. In one embodiment, the expressed recombinant
proteins of this invention can be used in immunoassays for
diagnosing or prognosing hepatitis C in a mammal. For the
purposes of the present invention, "mammal" as used
throughout the specification and claims, includes, but is
not limited to humans, chimpanzees, other primates and the
like. In a preferred embodiment, the immunoassay is useful
in diagnosing hepatitis C infection in humans.

Immunoassays of the present invention may be a
radioimmunoassay, Western blot assay, immunofluorescent
assay, enzyme immunoassay, chemiluminescent assay,
immunohistochemical assay and the like. Standard
techniques known in the art for ELISA are described in
Methods in Immunodiagnosis, 2nd Edition, Rose and Bigazzi,
eds., John Wiley and Sons, 1980 and Campbell et al.,
Methods of Immunology, W.A. Benjamin, Inc., 1964, both of
which are incorporated herein by reference. Such assays
may be a direct, indirect, competitive, or noncompetitive
immunoassay as described in the art (Oellerich, M. 1984. J.
Clin. Chem. Clin. Biochem. 22:895-904). Biological samples
appropriate for such detection assays include, but are not
limited to serum, liver, saliva, lymphocytes or other
mononuclear cells.

In a preferred embodiment, test serum is reacted
with a solid phase reagent having surface-bound recombinant
HCV E1 protein as an antigen. The solid surface reagent
can be prepared by known techniques for attaching protein
to solid support material. These attachment methods
include non-specific adsorption of the protein to the
support or covalent attachment of the protein to a reactive
group on the support. After reaction of the antigen with
anti-HCV antibody, unbound serum components are removed by
washing and the antigen-antibody complex is reacted with a
secondary antibody such as labelled anti-human antibody.
The label may be an enzyme which is detected by incubating
the solid support in the presence of a suitable
fluorimetric or colorimetric reagent. Other detectable
labels may also be used, such as radiolabels or colloidal
gold, and the like.

The HCV E1 protein and analogs thereof may be
prepared in the form of a kit, alone, or in combination
with other reagents such as secondary antibodies, for use
in immunoassays.

In yet another embodiment the recombinant E1
proteins or analogs thereof can be used as a vaccine to
protect mammals against challenge with hepatitis C. The
vaccine, which acts as an immunogen, may be a cell, cell
lysate from cells transfected with a recombinant expression
vector or a culture supernatant containing the expressed
protein. Alternatively, the immunogen is a partially or
substantially purified recombinant protein.

While it is possible for the immunogen to be
administered in a pure or substantially pure form, it is
preferable to present it as a pharmaceutical composition,
formulation or preparation.

The formulations of the present invention, both
for veterinary and for human use, comprise an immunogen as
described above, together with one or more pharmaceutically
acceptable carriers and optionally other therapeutic
ingredients. The carrier(s) must be "acceptable" in the
sense of being compatible with the other ingredients of the
formulation and not deleterious to the recipient thereof.
The formulations may conveniently be presented in unit
dosage form and may be prepared by any method well-known in
the pharmaceutical art.

All methods include the step of bringing into
association the active ingredient with the carrier which
constitutes one or more accessory ingredients. In general,
the formulations are prepared by uniformly and intimately
bringing into association the active ingredient with liquid
carriers or finely divided solid carriers or both, and
then, if necessary, shaping the product into the desired
formulation.

Formulations suitable for intravenous,
intramuscular, subcutaneous, or intraperitoneal
administration conveniently comprise sterile aqueous
solutions of the active ingredient with solutions which are
preferably isotonic with the blood of the recipient. Such
formulations may be conveniently prepared by dissolving the
solid active ingredient in water containing physiologically
compatible substances such as sodium chloride (e.g. 0.1-2.0 M),
glycine, and the like, and having a buffered pH
compatible with physiological conditions to produce an
aqueous solution, and rendering said solution sterile.
These may be presented in unit or multi-dose containers, for
example, sealed ampoules or vials.

The formulations of the present invention may
incorporate a stabilizer. Illustrative stabilizers are
preferably incorporated in an amount of 0.11-10,000 parts
by weight per part by weight of immunogen. If two or more
stabilizers are to be used, their total amount is
preferably within the range specified above. These
stabilizers are used in aqueous solutions at the
appropriate concentration and pH. The specific osmotic
pressure of such aqueous solutions is generally in the
range of 0.1-3.0 osmoles, preferably in the range of
0.8-1.2. The pH of the aqueous solution is adjusted to be
within the range of 5.0-9.0, preferably within the range of
6-8. In formulating the immunogen of the present
invention, an anti-adsorption agent may be used.

Additional pharmaceutical methods may be employed
to control the duration of action. Controlled release
preparations may be achieved through the use of polymers to
complex or adsorb the proteins or their derivatives. The
controlled delivery may be exercised by selecting
appropriate macromolecules (for example polyester,
polyamino acids, polyvinyl pyrrolidone,
ethylenevinylacetate, methylcellulose,
carboxymethylcellulose, or protamine sulfate) and the
concentration of macromolecules as well as the methods of
incorporation in order to control release. Another
possible method to control the duration of action by
controlled-release preparations is to incorporate the
proteins, protein analogs or their functional derivatives,
into particles of a polymeric material such as polyesters,
polyamino acids, hydrogels, poly(lactic acid) or ethylene
vinylacetate copolymers. Alternatively, instead of
incorporating these agents into polymeric particles, it is
possible to entrap these materials in microcapsules
prepared, for example, by coacervation techniques or by
interfacial polymerization, for example,
hydroxymethylcellulose or gelatin-microcapsules and
poly(methylmethacrylate) microcapsules, respectively, or in
colloidal drug delivery systems, for example, liposomes,
albumin microspheres, microemulsions, nanoparticles, and
nanocapsules or in macroemulsions.

When oral preparations are desired, the
compositions may be combined with typical carriers, such as
lactose, sucrose, starch, talc, magnesium stearate,
crystalline cellulose, methyl cellulose, carboxymethyl
cellulose, glycerin, sodium alginate or gum arabic among
others.

The proteins of the present invention may be
supplied in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above.

Vaccination can be conducted by conventional
methods. For example, the immunogen or immunogens (i.e.
the E1 protein may be administered alone or in combination
with the E1 proteins derived from other isolates of HCV)
can be used in a suitable diluent such as saline or water,
or complete or incomplete adjuvants. Further, the
immunogen(s) may or may not be bound to a carrier to make
the protein(s) immunogenic. Examples of such carrier
molecules include but are not limited to bovine serum
albumin (BSA), keyhole limpet hemocyanin (KLH), tetanus
toxoid, and the like. The immunogen(s) can be administered
by any route appropriate for antibody production such as
intravenous, intraperitoneal, intramuscular, subcutaneous,
and the like. The immunogen(s) may be administered once or
at periodic intervals until a significant titer of anti-HCV
antibody is produced. The antibody may be detected in the
serum using an immunoassay.

The administration of the immunogen(s) of the
present invention may be for either a prophylactic or
therapeutic purpose. When provided prophylactically, the
immunogen(s) is provided in advance of any exposure to HCV
or in advance of any symptoms due to HCV
infection. The prophylactic administration of the
immunogen serves to prevent or attenuate any subsequent
infection of HCV in a mammal. When provided
therapeutically, the immunogen(s) is provided at (or
shortly after) the onset of the infection or at the onset
of any symptom of infection or disease caused by HCV. The
therapeutic administration of the immunogen(s) serves to
attenuate the infection or disease.

In addition to use as a vaccine, the compositions
can be used to prepare antibodies to HCV E1 proteins. The
antibodies can be used directly as antiviral agents. To
prepare antibodies, a host animal is immunized using the E1
proteins native to the virus particle bound to a carrier as
described above for vaccines. The host serum or plasma is
collected following an appropriate time interval to provide
a composition comprising antibodies reactive with the E1
protein of the virus particle. The gamma globulin fraction
or the IgG antibodies can be obtained, for example, by use
of saturated ammonium sulfate or DEAE Sephadex, or other
techniques known to those skilled in the art. The
antibodies are substantially free of many of the adverse
side effects which may be associated with other anti-viral
agents such as drugs.

The antibody compositions can be made even more
compatible with the host system by minimizing potential
adverse immune system responses. This is accomplished by
removing all or a portion of the Fc portion of a foreign
species antibody or using an antibody of the same species
as the host animal, for example, the use of antibodies from
human/human hybridomas. Humanized antibodies (i.e.,
nonimmunogenic in a human) may be produced, for example, by
replacing an immunogenic portion of an antibody with a
corresponding, but nonimmunogenic portion (i.e., chimeric
antibodies). Such chimeric antibodies may contain the
reactive or antigen binding portion of an antibody from one
species and the Fc portion of an antibody (nonimmunogenic)
from a different species. Examples of chimeric antibodies
include, but are not limited to, non-human mammal-human
chimeras, rodent-human chimeras, murine-human and rat-human
chimeras (Robinson et al., International Patent Application
184,187; Taniguchi M., European Patent Application 171,496;
Morrison et al., European Patent Application 173,494;
Neuberger et al., PCT Application WO 86/01533; Cabilly et
al., 1987 Proc. Natl. Acad. Sci. USA 84:3439; Nishimura et
al., 1987 Canc. Res. 47:999; Wood et al., 1985 Nature
314:446; Shaw et al., 1988 J. Natl. Cancer Inst. 80:1553,
all incorporated herein by reference).

General reviews of "humanized" chimeric 
antibodies are provided by Morrison S., 1985 Science 
229:1202 and by Oi et al., 1986 BioTechniques 4:214. 

Suitable "humanized" antibodies can be
alternatively produced by CDR or CEA substitution (Jones et
al., 1986 Nature 321:552; Verhoeyan et al., 1988 Science
239:1534; Biedler et al., 1988 J. Immunol. 141:4053, all
incorporated herein by reference).

The antibodies or antigen binding fragments may
also be produced by genetic engineering. The technology
for expression of both heavy and light chain genes in E.
coli is the subject of PCT patent applications,
publication numbers WO 901443, WO901443, and WO 9014424, and
of Huse et al., 1989 Science 246:1275-1281.

The antibodies can also be used as a means of
enhancing the immune response. The antibodies can be
administered in amounts similar to those used for other
therapeutic administrations of antibody. For example,
normal immune globulin is administered at 0.02-0.1 ml/lb
body weight during the early incubation period of other
viral diseases such as rabies, measles, and hepatitis B to
interfere with viral entry into cells. Thus, antibodies
reactive with the HCV E1 protein can be passively
administered alone or in conjunction with another
anti-viral agent to a host infected with an HCV to enhance the
immune response and/or the effectiveness of an antiviral
drug.
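
As a worked illustration of the arithmetic behind the range quoted above for normal immune globulin (0.02-0.1 ml/lb body weight), the following minimal Python sketch converts a body weight into a volume range. The function name and the example weight are hypothetical, and the sketch shows only the multiplication involved; it does not state a dose for anti-HCV E1 antibodies.

```python
# Illustrative arithmetic only. The per-pound range is the one quoted in the
# text for normal immune globulin; names and inputs are assumptions.
LOW_ML_PER_LB = 0.02
HIGH_ML_PER_LB = 0.1


def immune_globulin_volume_range(body_weight_lb):
    """Return (minimum, maximum) volume in ml for a body weight in pounds."""
    if body_weight_lb <= 0:
        raise ValueError("body weight must be positive")
    return (LOW_ML_PER_LB * body_weight_lb, HIGH_ML_PER_LB * body_weight_lb)


if __name__ == "__main__":
    low, high = immune_globulin_volume_range(150)
    print(f"150 lb: {low:.1f} ml to {high:.1f} ml")
```
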

Alternatively, anti-HCV E1 antibodies can be
induced by administering anti-idiotype antibodies as
immunogens. Conveniently, a purified anti-HCV E1 antibody
preparation prepared as described above is used to induce
anti-idiotype antibody in a host animal. The composition is
administered to the host animal in a suitable diluent.
Following administration, usually repeated administration,
the host produces anti-idiotype antibody. To eliminate an
immunogenic response to the Fc region, antibodies produced
by the same species as the host animal can be used or the
Fc region of the administered antibodies can be removed.
Following induction of anti-idiotype antibody in the host
animal, serum or plasma is removed to provide an antibody
composition. The composition can be purified as described
above for anti-HCV E1 antibodies, or by affinity
chromatography using anti-HCV E1 antibodies bound to the
affinity matrix. The anti-idiotype antibodies produced are
similar in conformation to the authentic HCV E1 protein and
may be used to prepare an HCV vaccine rather than using an
HCV E1 protein.

When used as a means of inducing anti-HCV virus
antibodies in an animal, the manner of injecting the
antibody is the same as for vaccination purposes, namely
intramuscularly, intraperitoneally, subcutaneously or the
like in an effective concentration in a physiologically
suitable diluent with or without adjuvant. One or more
booster injections may be desirable.

The HCV E1 proteins of the invention are also
intended for use in producing antiserum designed for pre-
or post-exposure prophylaxis. Here an E1 protein, or
mixture of E1 proteins, is formulated with a suitable
adjuvant and administered by injection to human volunteers,
according to known methods for producing human antisera.
Antibody response to the injected proteins is monitored,
during a several-week period following immunization, by
periodic serum sampling to detect the presence of anti-HCV
E1 serum antibodies, using an immunoassay as described
herein.

The antiserum from immunized individuals may be
administered as a pre-exposure prophylactic measure for
individuals who are at risk of contracting infection. The
antiserum is also useful in treating an individual
post-exposure, analogous to the use of high titer antiserum
against hepatitis B virus for post-exposure prophylaxis.

For both in vivo use of antibodies to HCV virus-like
particles and proteins and anti-idiotype antibodies
and diagnostic use, it may be preferable to use monoclonal
antibodies. Monoclonal anti-HCV E1 protein antibodies or
anti-idiotype antibodies can be produced as follows. The
spleen or lymphocytes from an immunized animal are removed
and immortalized or used to prepare hybridomas by methods
known to those skilled in the art (Goding, J.W. 1983.
Monoclonal Antibodies: Principles and Practice, Academic
Press, Inc., NY, NY, pp. 56-97). To produce a human-human
hybridoma, a human lymphocyte donor is selected. A donor
known to be infected with HCV (where infection has been
shown for example by the presence of anti-virus antibodies
in the blood or by virus culture) may serve as a suitable
lymphocyte donor. Lymphocytes can be isolated from a
peripheral blood sample or spleen cells may be used if the
donor is subject to splenectomy. Epstein-Barr virus (EBV)
can be used to immortalize human lymphocytes or a human
fusion partner can be used to produce human-human
hybridomas. Primary in vitro immunization with peptides
can also be used in the generation of human monoclonal
antibodies.

Antibodies secreted by the immortalized cells are
screened to determine the clones that secrete antibodies of
the desired specificity. For monoclonal anti-E1
antibodies, the antibodies must bind to HCV E1 protein.
For monoclonal anti-idiotype antibodies, the antibodies
must bind to anti-E1 protein antibodies. Cells producing
antibodies of the desired specificity are selected.

The present invention also relates to the use of
single-stranded antisense poly- or oligonucleotides derived
from nucleotide sequences substantially homologous to those
shown in SEQ ID NOs:1-51 to inhibit the expression of
hepatitis C E1 genes. By substantially homologous as used
throughout the specification and claims to describe the
nucleic acid sequences of the present invention, is meant a
level of homology between the nucleic acid sequence and the
SEQ ID NOs. referred to in that sentence. Preferably, the
level of homology is in excess of 80%, more preferably in
excess of 90%, with a preferred nucleic acid sequence being
in excess of 95% homologous with the DNA sequence shown in
the indicated SEQ ID NO. These antisense poly- or
oligonucleotides can be either DNA or RNA. The targeted
sequence is typically messenger RNA and more preferably, a
single sequence required for processing or translation of
the RNA. The antisense poly- or oligonucleotides can be
conjugated to a polycation such as polylysine as disclosed
in Lemaitre, M. et al. ((1989) Proc. Natl. Acad. Sci. USA
84:648-652) and this conjugate can be administered to a
mammal in an amount sufficient to hybridize to and inhibit
the function of the messenger RNA.

The present invention further relates to multiple
computer-generated alignments of the nucleotide and deduced
amino acid sequences shown in SEQ ID NOs:1-102. Computer
analysis of the nucleotide sequences shown in SEQ ID
NOs:1-51 and of the deduced amino acid sequences shown in SEQ ID
NOs:52-102 can be carried out using commercially available
computer programs known to one skilled in the art.

In one embodiment, computer analysis of SEQ ID
NOs:1-51 by the program GENALIGN (IntelliGenetics, Inc.,
Mountain View, CA) results in distribution of the 51
sequences into twelve genotypes based upon the degree of
variation of the sequences. For the purposes of the
present invention, the nucleotide sequence identity of E1
cDNAs of HCV isolates of the same genotype is in the range
of about 85% to about 100% whereas the identity of E1 cDNA
sequences of different genotypes is in the range of about
50% to about 80%.
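
The identity ranges stated above can be illustrated with a minimal Python sketch that computes percent identity between two pre-aligned E1 cDNA sequences and applies the two ranges. This is not the GENALIGN procedure, which is not described here; the function names, the skipping of gap columns, and the "indeterminate" label for values between 80% and 85% are assumptions made for illustration.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # assumption: gap columns are ignored
        compared += 1
        if base_a == base_b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def genotype_relationship(identity):
    """Apply the ranges from the text: about 85-100% same, about 50-80% different."""
    if identity >= 85.0:
        return "same genotype"
    if identity <= 80.0:
        return "different genotypes"
    return "indeterminate"  # assumption: the text gives no rule for this interval


if __name__ == "__main__":
    # Made-up ten-base fragments, not HCV sequence data.
    identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
    print(identity, genotype_relationship(identity))
```
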

The grouping of SEQ ID NOs:1-51 into twelve HCV
genotypes is shown below.

SEQ ID NOs:    Genotype

1-8            I/1a
9-25           II/1b
26-29          III/2a
30-33          IV/2b
34             2c
35-39          V/3a
40             4a
41             4b
42-43          4c
44             4d
45-50          5a
51             6a
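
The grouping in the table above can be expressed as a small lookup, sketched here in Python. The data structure and function name are hypothetical; only the SEQ ID NO ranges and genotype designations are taken from the table.

```python
# SEQ ID NO ranges (inclusive) and genotype designations from the table above.
GENOTYPE_RANGES = [
    ((1, 8), "I/1a"),
    ((9, 25), "II/1b"),
    ((26, 29), "III/2a"),
    ((30, 33), "IV/2b"),
    ((34, 34), "2c"),
    ((35, 39), "V/3a"),
    ((40, 40), "4a"),
    ((41, 41), "4b"),
    ((42, 43), "4c"),
    ((44, 44), "4d"),
    ((45, 50), "5a"),
    ((51, 51), "6a"),
]


def genotype_of(seq_id_no):
    """Return the genotype assigned to an E1 nucleotide SEQ ID NO (1-51)."""
    for (first, last), genotype in GENOTYPE_RANGES:
        if first <= seq_id_no <= last:
            return genotype
    raise KeyError(f"SEQ ID NO:{seq_id_no} is not one of the 51 E1 sequences")


if __name__ == "__main__":
    for number in (1, 25, 41, 51):
        print(number, genotype_of(number))
```
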

For those genotypes containing more than one E1
nucleotide sequence, computer alignment of the constituent
nucleotide sequences of the genotype was conducted using
GENALIGN in order to produce a consensus sequence for each
genotype. These alignments and their resultant consensus
sequences are shown in Figures 1A-G for the seven genotypes
(I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a) which
comprise more than one nucleotide sequence. Further
alignment of the consensus sequences of Figures 1A-G with
SEQ ID NO:34 (genotype 2c), SEQ ID NO:40 (genotype 4a), SEQ
ID NO:41 (genotype 4b), SEQ ID NO:44 (genotype 4d) and SEQ
ID NO:51 (genotype 6a) produces a consensus sequence for
all twelve genotypes as shown in Figure 1H. The multiple
alignments of nucleotide sequences shown in Figures 1A-H
serve to highlight regions of homology and non-homology
between different sequences and hence can be used by one
skilled in the art to design oligonucleotides useful as
reagents in diagnostic assays for HCV.
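
To illustrate how a consensus sequence can be derived from aligned sequences of one genotype, the following Python sketch applies a simple majority rule per alignment column. The rule used by GENALIGN is not stated here, so the majority rule, the use of "N" for tied columns, and the function name are assumptions; the input fragments are made up.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Majority-rule consensus of equal-length aligned sequences.

    Columns with no single most frequent base are reported as "N"
    (an assumption; other conventions are possible).
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*(seq.upper() for seq in aligned_sequences)):
        counts = Counter(column)
        top_count = max(counts.values())
        leaders = [base for base, count in counts.items() if count == top_count]
        result.append(leaders[0] if len(leaders) == 1 else "N")
    return "".join(result)


if __name__ == "__main__":
    print(consensus(["ACGT", "ACGA", "ACGT"]))
```

Columns reported as "N" or showing minority bases mark candidate regions of non-homology, while fully conserved columns mark regions of homology.
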

Examples of purified and isolated oligonucleotide
sequences provided by the present invention are shown as
SEQ ID NOs:109-135. The oligonucleotides shown in SEQ ID
NOs:109-135 are useful as "genotype-specific" primers and
probes since these oligonucleotides can hybridize
specifically to the nucleotide sequence of the E1 gene of
HCV isolates belonging to a single genotype. The
genotype-specificity of the oligonucleotides shown in SEQ ID
NOs:109-135 is as follows: SEQ ID NOs:109-110 are specific
for genotype I/1a; SEQ ID NOs:111-112 are specific for
genotype II/1b; SEQ ID NOs:113-114 are specific for
genotype III/2a; SEQ ID NOs:115-116 are specific for
genotype IV/2b; SEQ ID NOs:117-119 are specific for
genotype 2c; SEQ ID NOs:120-122 are specific for genotype
V/3a; SEQ ID NOs:123-124 are specific for genotype 4a; SEQ
ID NOs:125-126 are specific for genotype 4b; SEQ ID
NOs:127-128 are specific for genotype 4c; SEQ ID
NOs:129-130 are specific for genotype 4d; SEQ ID NOs:131-132 are
specific for genotype 5a and SEQ ID NOs:133-135 are
specific for genotype 6a.

The oligonucleotides of this invention can be
synthesized using any of the known methods of
oligonucleotide synthesis (e.g., the phosphodiester method
of Agarwal et al. 1972, Angew. Chem. Int. Ed. Engl. 11:451,
the phosphotriester method of Hsiung et al. 1979, Nucleic
Acids Res. 6:1371, or the automated diethylphosphoramidite
method of Beaucage et al. 1981, Tetrahedron Letters
22:1859-1862), or they can be isolated fragments of
naturally occurring or cloned DNA. In addition, those
skilled in the art would be aware that oligonucleotides can
be synthesized by automated instruments sold by a variety
of manufacturers or can be commercially custom ordered and
prepared. In a preferred embodiment, SEQ ID NO:103 through
SEQ ID NO:135 are synthetic oligonucleotides.

The present invention also relates to a method 
for detecting the presence of HCV in a mammal, said method 
comprising analyzing the RNA of a mammal for the presence 
of hepatitis C virus. 

The RNA to be analyzed can be isolated from
serum, liver, saliva, lymphocytes or other mononuclear
cells as viral RNA, whole cell RNA or as poly(A)+ RNA.
Whole cell RNA can be isolated by methods known to those
skilled in the art. Such methods include extraction of RNA
by differential precipitation (Birnboim, H.C. (1988)
Nucleic Acids Res., 16:1487-1497), extraction of RNA by
organic solvents (Chomczynski, P. et al. (1987) Anal.
Biochem., 162:156-159) and extraction of RNA with strong
denaturants (Chirgwin, J.M. et al. (1979) Biochemistry,
18:5294-5299). Poly(A)+ RNA can be selected from whole cell
RNA by affinity chromatography on oligo-d(T) columns (Aviv,
H. et al. (1972) Proc. Natl. Acad. Sci., 69:1408-1412). A
preferred method of isolating RNA is extraction of viral
RNA by the guanidinium-phenol-chloroform method of Bukh et
al. (1992a).

The methods for analyzing the RNA for the
presence of HCV include Northern blotting (Alwine, J.C. et
al. (1977) Proc. Natl. Acad. Sci., 74:5350-5354), dot and
slot hybridization (Kafatos, F.C. et al. (1979) Nucleic
Acids Res., 7:1541-1552), filter hybridization (Hollander,
M.C. et al. (1990) Biotechniques, 9:174-179), RNase
protection (Sambrook, J. et al. (1989) in "Molecular
Cloning, A Laboratory Manual", Cold Spring Harbor Press,
Plainview, NY) and reverse-transcription polymerase chain
reaction (RT-PCR) (Watson, J.D. et al. (1992) in
"Recombinant DNA" Second Edition, W.H. Freeman and Company,
New York). A preferred method is RT-PCR. In this method,
the RNA can be reverse transcribed to first strand cDNA
using a primer or primers derived from the nucleotide
sequences shown in SEQ ID NOs:1-51. A preferred primer for
reverse transcription is that shown in SEQ ID NO:104. Once
the cDNAs are synthesized, PCR amplification is carried out
using pairs of primers designed to hybridize with sequences
in the HCV E1 cDNA which are an appropriate distance apart
(at least about 50 nucleotides) to permit amplification of
the cDNA and subsequent detection of the amplification
product. Each primer of a pair is a single-stranded
oligonucleotide of about 20 to about 60 bases in length
where one primer (the "upstream" primer) is complementary
to the original RNA and the second primer (the "downstream"
primer) is complementary to the first strand of cDNA
generated by reverse transcription of the RNA. The target
sequence is generally about 100 to about 300 base pairs
long but can be as large as 500-1500 base pairs.
Optimization of the amplification reaction to obtain
sufficiently specific hybridization to the E1 nucleotide
sequence is well within the skill in the art and is
preferably achieved by adjusting the annealing temperature.
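
The numerical constraints on primer pairs described above (primer length of about 20 to about 60 bases, binding sites at least about 50 nucleotides apart, target generally about 100 to about 300 base pairs) can be checked with a short Python sketch. The class and function names are hypothetical, the "distance apart" is interpreted here as the gap between the two binding sites, and the size labels are assumptions; the example coordinates are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerSite:
    start: int   # 1-based first position of the binding site on the E1 cDNA
    length: int  # primer length in bases

    @property
    def end(self):
        return self.start + self.length - 1


def check_primer_pair(left, right):
    """Return (problems, product_length, size_class) for a primer pair.

    `left` binds nearer the 5' end of the E1 coding sequence than `right`.
    """
    problems = []
    for label, site in (("left", left), ("right", right)):
        if not 20 <= site.length <= 60:
            problems.append(f"{label} primer length {site.length} is outside 20-60 bases")
    gap = right.start - left.end - 1  # nucleotides between the two sites
    if gap < 50:
        problems.append(f"primer sites are only {gap} nucleotides apart")
    product = right.end - left.start + 1
    if 100 <= product <= 300:
        size_class = "typical"
    elif 300 < product <= 1500:
        size_class = "large"     # assumption: anything above 300 up to 1500
    else:
        size_class = "atypical"  # assumption: outside the stated ranges
    return problems, product, size_class


if __name__ == "__main__":
    print(check_primer_pair(PrimerSite(197, 24), PrimerSite(457, 24)))
```
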

In one embodiment, the primer pairs selected to
amplify E1 cDNAs are universal primers. By "universal", as
used to describe primers throughout the claims and
specification, is meant those primer pairs which can
amplify E1 gene fragments derived from an HCV isolate
belonging to any one of the twelve genotypes of HCV
described herein. Purified and isolated universal primers
are used in Example 1 of the present invention and are
shown as SEQ ID NOs:103-108 where SEQ ID NOs:103 and 104
represent one pair of primers, SEQ ID NOs:105 and 106
represent a second pair of primers and SEQ ID NOs:107-108
represent a third pair of primers.

In an alternative embodiment, primer pairs
selected to amplify E1 cDNAs are genotype-specific primers.
In the present invention, genotype-specific primer pairs
can readily be derived from the following genotype-specific
nucleotide domains: nucleotides 197-238 and 450-480 of the
consensus sequence of genotype I/1a shown in Figure 1A;
nucleotides 197-238 and 450-480 of the consensus sequence
of genotype II/1b shown in Figure 1B; nucleotides 199-238
and 438-480 of the consensus sequence of genotype III/2a
shown in Figure 1C; nucleotides 124-177 and 450-480 of the
consensus sequence of genotype IV/2b shown in Figure 1D;
nucleotides 124-177, 193-238 and 436-480 of SEQ ID NO:34
(genotype 2c); nucleotides 168-207, 294-339 and 406-480 of
the consensus sequence of genotype V/3a shown in Figure 1E;
nucleotides 145-183 and 439-480 of SEQ ID NO:40 (genotype
4a); nucleotides 168-207 and 432-480 of SEQ ID NO:41
(genotype 4b); nucleotides 130-183 and 450-480 of the
consensus sequence of genotype 4c shown in Figure 1F;
nucleotides 130-183 and 450-480 of SEQ ID NO:44 (genotype
4d); nucleotides 166-208 and 437-480 of the consensus
sequence of genotype 5a shown in Figure 1G; and nucleotides
168-207, 216-252 and 429-480 of SEQ ID NO:51 (genotype 6a).
One skilled in the art would readily appreciate that in a
pair of genotype-specific primers, each primer is derived
from a different genotype-specific nucleotide domain
indicated above for a given genotype. Also, as described
earlier, it is understood by one skilled in the art that
each pair of primers comprises one primer which is
complementary to the original viral RNA and the other which
is complementary to the first strand of cDNA generated by
reverse transcription of the viral RNA. For example, in a
pair of genotype-specific primers for genotype 4b, one
primer would have a nucleotide sequence derived from region
168-207 of SEQ ID NO:41 and the other primer would have a
nucleotide sequence which is the complement of region
432-480 of SEQ ID NO:41. One skilled in the art would readily
recognize that such genotype-specific domains would also be
useful in designing oligonucleotides for use as
genotype-specific hybridization probes. Indeed, the sequences of
such genotype-specific hybridization probes are disclosed
later in the specification.
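
The derivation of a primer pair from two genotype-specific domains, as in the genotype 4b example above (regions 168-207 and 432-480), can be sketched in Python as follows. The helper names, the 20-base primer length, and the choice of taking each primer from the start of its domain are assumptions, and the template is a made-up placeholder rather than an HCV sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.upper().translate(COMPLEMENT)[::-1]


def region(sequence, start, end):
    """Extract a 1-based, inclusive region such as nucleotides 168-207."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("region lies outside the sequence")
    return sequence[start - 1:end]


def primer_pair(sequence, first_domain, second_domain, length=20):
    """One primer taken from the first domain, the other from the
    complement of the second domain, each written 5' to 3'."""
    sense_primer = region(sequence, *first_domain)[:length]
    antisense_primer = reverse_complement(region(sequence, *second_domain))[:length]
    return sense_primer, antisense_primer


if __name__ == "__main__":
    placeholder = "ACGT" * 120  # 480 nt stand-in, NOT real E1 sequence
    print(primer_pair(placeholder, (168, 207), (432, 480)))
```
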

The amplification products of PCR can be detected
either directly or indirectly. In one embodiment, direct
detection of the amplification products is carried out via
labelling of primer pairs. Labels suitable for labelling
the primers of the present invention are known to one
skilled in the art and include radioactive labels, biotin,
avidin, enzymes and fluorescent molecules. The desired
labels can be incorporated into the primers prior to
performing the amplification reaction. A preferred
labelling procedure utilizes radiolabeled ATP and T4
polynucleotide kinase (Sambrook, J. et al. (1989) in
"Molecular Cloning, A Laboratory Manual", Cold Spring
Harbor Press, Plainview, NY). Alternatively, the desired
label can be incorporated into the primer extension
products during the amplification reaction in the form of
one or more labelled dNTPs. In the present invention, the
labelled amplified PCR products can be detected by agarose
gel electrophoresis followed by ethidium bromide staining
and visualization under ultraviolet light or via direct
sequencing of the PCR products.

In yet another embodiment, unlabelled
amplification products can be detected via hybridization
with labelled nucleic acid probes, radioactively labelled
or labelled with biotin, in methods known to one skilled
in the art such as dot and slot blot hybridization
(Kafatos, F.C. et al. (1979)) or filter hybridization
(Hollander, M.C. et al. (1990)).

In one embodiment, the nucleic acid sequences
used as probes are selected from, and substantially
homologous to, SEQ ID NOs:1-51. Such probes are useful as
universal probes in that they can detect PCR-amplification
products of E1 cDNAs of an HCV isolate
belonging to any of the twelve HCV genotypes disclosed
herein. The size of these probes can range from about 200
to about 500 nucleotides.

In an alternative embodiment, the present
invention relates to a method for determining the genotype
of a hepatitis C virus present in a mammal where said
method comprises:

(a) amplifying RNA of a mammal via RT-PCR to
produce amplification products;

(b) contacting said products with at least one
genotype-specific oligonucleotide; and

(c) detecting complexes of said products which
bind to said oligonucleotide(s).

In this method, one embodiment of said
amplification step is carried out using the universal
primers (SEQ ID NO:103 through SEQ ID NO:108) as disclosed
above. In step (b) of this method, the nucleic acid
sequences used as probes are substantially homologous to
the sequences shown in SEQ ID NOs:109-135. The probes
disclosed in SEQ ID NOs:109-135 are useful in specifically
detecting PCR-amplification products of E1 cDNAs of HCV
isolates belonging to one of the twelve HCV genotypes
disclosed herein. In a preferred embodiment, probes having
sequences substantially homologous to the sequences shown
in SEQ ID NOs:109-135 are used alone or in combination with
other probes specific to the same genotype.

For example, a probe having a sequence according
to SEQ ID NO:109 can be used alone or in combination with a
probe having a sequence according to SEQ ID NO:110. The
probes derived from SEQ ID NOs:109-135 can range in size
from about 30 to about 70 nucleotides and can be
synthesized as described earlier.

The nucleic acid sequence used as a probe to
detect PCR amplification products of the present invention
can be labeled in single-stranded or double-stranded form.
Labelling of the nucleic acid sequence can be carried out
by techniques known to one skilled in the art. Such
labelling techniques can include radiolabels and enzymes
(Sambrook, J. et al. (1989) in "Molecular Cloning, A
Laboratory Manual", Cold Spring Harbor Press, Plainview,
New York). In addition, there are known non-radioactive
techniques for signal amplification including methods for
attaching chemical moieties to pyrimidine and purine rings
(Dale, R.N.K. et al. (1973) Proc. Natl. Acad. Sci.,
70:2238-2242; Heck, R.F. (1968) J. Am. Chem. Soc.,
90:5518-5523), methods which allow detection by
chemiluminescence (Barton, S.K. et al. (1992) J. Am. Chem.
Soc., 114:8736-8740), methods utilizing biotinylated
nucleic acid probes (Johnson, T.K. et al. (1983) Anal.
Biochem., 133:126-131; Erickson, P.F. et al. (1982) J. of
Immunology Methods, 51:241-249; Matthaei, F.S. et al.
(1986) Anal. Biochem., 157:123-128) and methods which allow
detection by fluorescence using commercially available
products.

The present invention also relates to computer
analysis of the amino acid sequences shown in SEQ ID
NOs:52-102 by the program GENALIGN. This analysis groups
the 51 amino acid sequences shown in SEQ ID NOs:52-102 into
the twelve genotypes disclosed earlier in this application
based upon the degree of variation of the amino acid
sequences. For the purposes of the present invention, the
amino acid sequence identity of E1 amino acid sequences of
the same genotype ranges from about 85% to about 100%
whereas the identity of E1 sequences of different genotypes
ranges from about 45% to about 80%.
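
The two identity ranges stated above can be read as a simple
decision rule. The sketch below is one possible reading only:
the text gives approximate ranges ("about"), so the hard
cut-offs of 85% and 80%, the "indeterminate" outcome for
values between them, and the function name are all
assumptions.

```python
# Illustrative sketch: interpret a pairwise E1 amino acid identity
# using the approximate ranges quoted in the text. The exact
# cut-offs are assumptions, not values fixed by the specification.
def compare_by_identity(identity_pct: float,
                        same_min: float = 85.0,
                        different_max: float = 80.0) -> str:
    if not 0.0 <= identity_pct <= 100.0:
        raise ValueError("identity must be a percentage")
    if identity_pct >= same_min:
        return "same genotype"
    if identity_pct <= different_max:
        return "different genotypes"
    # Values between the two stated ranges are not covered by the text.
    return "indeterminate"
```

A pair of isolates sharing 96.4% of their E1 amino acids would
be reported as "same genotype" under this rule, and a pair
sharing 49.0% as "different genotypes".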

The grouping of SEQ ID NOs:52-102 into the twelve
HCV genotypes is shown below:

SEQ ID NOs:     Genotypes
52-59           I/1a
60-76           II/1b
77-80           III/2a
81-84           IV/2b
85              2c
86-90           V/3a
91              4a
92              4b
93-94           4c
95              4d
96-101          5a
102             6a
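
The grouping above can be held as a small lookup table. The
following Python sketch simply transcribes the SEQ ID NO
ranges listed in the text; the variable and function names
are hypothetical.

```python
# Illustrative sketch: look up the genotype assigned to an E1 amino
# acid sequence by its SEQ ID NO, using the grouping given in the text.
GENOTYPE_RANGES = [
    ((52, 59), "I/1a"),
    ((60, 76), "II/1b"),
    ((77, 80), "III/2a"),
    ((81, 84), "IV/2b"),
    ((85, 85), "2c"),
    ((86, 90), "V/3a"),
    ((91, 91), "4a"),
    ((92, 92), "4b"),
    ((93, 94), "4c"),
    ((95, 95), "4d"),
    ((96, 101), "5a"),
    ((102, 102), "6a"),
]


def genotype_of(seq_id_no: int) -> str:
    for (first, last), genotype in GENOTYPE_RANGES:
        if first <= seq_id_no <= last:
            return genotype
    raise ValueError("SEQ ID NO is outside 52-102")


# Number of sequences per genotype, derived from the ranges.
counts = {g: last - first + 1 for (first, last), g in GENOTYPE_RANGES}
```

Summing the per-genotype counts gives 51 sequences in twelve
genotypes, in agreement with the text.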



For those genotypes containing more than one E1
amino acid sequence, computer alignment of the constituent
sequences of each genotype was conducted using the computer
program GENALIGN in order to produce a consensus sequence
for each genotype. These alignments and their resultant
consensus sequences are shown in Figures 2A-G for the seven
genotypes (I/1a, II/1b, III/2a, IV/2b, V/3a, 4c and 5a)
which comprise more than one sequence. Further alignment
of the consensus sequences shown in Figures 2A-G with the
amino acid sequences of SEQ ID NO:85 (genotype 2c); SEQ ID
NO:91 (genotype 4a); SEQ ID NO:92 (genotype 4b); SEQ ID
NO:95 (genotype 4d) and SEQ ID NO:102 (genotype 6a) to
produce a consensus amino acid sequence for all twelve
genotypes is shown in Figure 2H. The multiple alignment of
E1 amino acid sequences shown in Figures 2A-H serves to
highlight regions of homology and non-homology between
amino acid sequences and hence, these alignments can
readily be used by one skilled in the art to derive
peptides useful in assays and vaccines for the diagnosis
and prevention of HCV infection. Examples of purified and
isolated peptides provided by the present invention are
shown as SEQ ID NOs:136-159. These peptides are derived
from two regions of the amino acid sequences shown in
Figures 2A-H, amino acids 48-80 and amino acids 138-160.
The peptides shown in SEQ ID NOs:136-159 are useful as
genotype-specific diagnostic reagents since they are
capable of detecting an immune response specific to HCV
isolates belonging to a single genotype. The
genotype-specificity of the peptides shown in SEQ ID
NOs:136-159 is as follows: SEQ ID NOs:136 and 148 are
specific for genotype IV/2b; SEQ ID NOs:137 and 149 are
specific for genotype 2c; SEQ ID NOs:138 and 150 are
specific for genotype III/2a; SEQ ID NOs:139 and 151 are
specific for genotype V/3a; SEQ ID NOs:140 and 152 are
specific for genotype II/1b; SEQ ID NOs:141 and 153 are
specific for genotype I/1a; SEQ ID NOs:142 and 154 are
specific for genotype 4a; SEQ ID NOs:143 and 155 are
specific for genotype 4c; SEQ ID NOs:144 and 156 are
specific for genotype 4d; SEQ ID NOs:145 and 157 are
specific for genotype 4b; SEQ ID NOs:146 and 158 are
specific for genotype 5a and SEQ ID NOs:147 and 159 are
specific for genotype 6a. In SEQ ID NO:136, Xaa at position
22 is a residue of Ala or Thr, Xaa at position 24 is a
residue of Val or Ile, Xaa at position 26 is a residue of
Val or Met; in SEQ ID NO:138, Xaa at position 5 is a Ser or
Thr residue, Xaa at position 11 is an Arg or Gln residue,
Xaa at position 12 is an Arg or Gln residue; in SEQ ID
NO:139,



WO 95/01442 



PCT/US94/07320 




Xaa at position 3 is a Pro or Ser residue, Xaa at position
33 is a Leu or Met residue; in SEQ ID NO:140, Xaa at
position 5 is a Thr or Ala residue, Xaa at position 13 is a
Gly, Ala, Ser, Val or Thr residue, Xaa at position 14 is a
Ser, Thr or Asn residue, Xaa at position 15 is a Val or Ile
residue, Xaa at position 16 is a Pro or Ser residue, Xaa at
position 18 is a Thr or Lys residue, Xaa at position 19 is
a Thr or Ala residue, Xaa at position 22 is an Arg or His
residue, Xaa at position 32 is an Ala, Val or Thr residue;
in SEQ ID NO:141, Xaa at position 3 is an Ala or Pro
residue, Xaa at position 4 is a Val or Met residue, Xaa at
position 5 is a Thr or Ala residue, Xaa at position 17 is a
Thr or Ala residue, Xaa at position 18 is a Thr or Ala
residue, Xaa at position 23 is a His or Tyr residue; in SEQ
ID NO:143, Xaa at position 10 is a Val or Ala residue, Xaa
at position 11 is a Ser or Pro residue, Xaa at position 18
is an Asp or Glu residue, Xaa at position 20 is a Leu or Ile
residue; in SEQ ID NO:146, Xaa at position 3 is a Gln or
His residue, Xaa at position 12 is an Asn, Ser or Thr
residue, Xaa at position 13 is a Leu or Phe residue, Xaa at
position 23 is an Ala or Val residue; in SEQ ID NO:148, Xaa
at position 16 is a Val or Ala residue, Xaa at position 18
is a Glu or Gln residue; in SEQ ID NO:150, Xaa at position
2 is an Ala or Thr residue, Xaa at position 4 is a Met or
Leu residue, Xaa at position 9 is an Ala or Val residue,
Xaa at position 17 is an Ile or Leu residue, Xaa at
position 20 is an Ile or Val residue, Xaa at position 21 is
a Ser or Gly residue; in SEQ ID NO:151, Xaa at position 9
is a Val or Ile residue, Xaa at position 16 is a Leu or Val
residue, Xaa at position 20 is an Ile or Leu residue; in
SEQ ID NO:152, Xaa at position 2 is an Ala or Thr residue,
Xaa at position 6 is a Val or Leu residue, Xaa at position
12 is an Ile or Leu residue, Xaa at position 16 is a Val or
Ile residue, Xaa at position 17 is a Val, Leu or Met
residue, Xaa at position 19 is a Met or Val residue, Xaa at
position 21 is an Ala or Thr residue; in SEQ ID NO:153, Xaa
at position 2 is a Thr or Ala residue, Xaa at position 6 is
a Val, Ile or Met residue, Xaa at position 12 is an Ile or
Val residue, Xaa at position 16 is an Ile or Val residue; in
SEQ ID NO:155, Xaa at position 5 is a Leu or Val residue,
Xaa at position 21 is a Thr or Ala residue; in SEQ ID
NO:158, Xaa at position 1 is a Thr or Ala residue, Xaa at
position 5 is a Val or Leu residue, Xaa at position 9 is a
Leu, Met or Val residue, Xaa at position 23 is a Gly or Ala
residue.
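
The pairing of peptide SEQ ID NOs with genotypes follows a
regular pattern (SEQ ID NOs:136-147 and 148-159 run through
the genotypes in the same order), which the following Python
sketch captures. The region assigned to each block follows
the statement later in the specification that SEQ ID
NOs:136-147 derive from amino acids 48-80 and SEQ ID
NOs:148-159 from amino acids 138-160; the names used are
hypothetical.

```python
# Illustrative sketch: map a peptide SEQ ID NO (136-159) to the
# genotype it is specific for and the E1 region it is derived from.
GENOTYPE_ORDER = ["IV/2b", "2c", "III/2a", "V/3a", "II/1b", "I/1a",
                  "4a", "4c", "4d", "4b", "5a", "6a"]


def peptide_info(seq_id_no: int):
    if 136 <= seq_id_no <= 147:
        index, region = seq_id_no - 136, (48, 80)
    elif 148 <= seq_id_no <= 159:
        index, region = seq_id_no - 148, (138, 160)
    else:
        raise ValueError("peptide SEQ ID NO must be in 136-159")
    return GENOTYPE_ORDER[index], region


def peptides_for(genotype: str):
    """Return the two peptide SEQ ID NOs specific for a genotype."""
    index = GENOTYPE_ORDER.index(genotype)
    return 136 + index, 148 + index
```

For instance, the two peptides specific for genotype 5a are
returned as SEQ ID NOs:146 and 158.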

Those skilled in the art would be aware that the
peptides of the present invention or analogs thereof can be
synthesized by automated instruments sold by a variety of
manufacturers or can be commercially custom-ordered and
prepared. The term analog has been described earlier in
the specification and for purposes of describing the
peptides of the present invention, analogs can further
include branched or non-linear arrangements of the peptide
sequences shown in SEQ ID NOs:136-159.

Alternatively, peptides can be expressed from
nucleic acid sequences where such sequences can be DNA,
cDNA, RNA or any variant thereof which is capable of
directing protein synthesis. In one embodiment,
restriction digest fragments containing a coding sequence
for a peptide can be inserted into a suitable expression
vector that functions in prokaryotic or eukaryotic cells.
Such restriction digest fragments may be obtained from
clones isolated from prokaryotic or eukaryotic sources
which encode the peptide sequence.

Suitable expression vectors and methods of
isolating clones encoding the peptide sequences of the
present invention have previously been described.

The preferred size of the peptides of the present
invention is from about 8 to about 100 amino acids in
length.

The present invention further relates to the use
of the peptides shown in SEQ ID NOs:136-159 in methods of
detecting antibodies specific for HCV in biological
samples. In one embodiment, at least one peptide specific
for a single genotype is used in previously described
immunoassays to detect antibodies specific for a single
genotype of HCV. A preferred immunoassay is ELISA.

It is understood by one skilled in the art that
the diagnostic assays described herein using
genotype-specific oligonucleotides or genotype-specific
peptides can be useful in assisting one skilled in the art
to choose a course of therapy for the HCV-infected
individual.

In an alternative embodiment, a mixture of
peptides can be used in an immunoassay to detect antibodies
to any of the twelve genotypes of HCV. The mixture of
peptides, as disclosed herein, comprises at least one
peptide selected from SEQ ID NOs:140-141 and 152-153; one
peptide selected from SEQ ID NOs:136, 138, 148 and 150; one
peptide selected from SEQ ID NOs:142-145 and 154-157; one
peptide selected from SEQ ID NOs:146 and 158; one peptide
selected from SEQ ID NOs:139 and 151; one peptide selected
from SEQ ID NOs:138 and 150 and one peptide selected from
SEQ ID NOs:140 and 159. In a preferred embodiment, the
peptides of the present invention can be used in an ELISA
assay as described previously for E1 proteins.

The peptides or analogs thereof may be prepared
in the form of a kit, alone or in combination with other
reagents such as secondary antibodies, for use in
immunoassay. In addition, since the genotype-specific
peptides shown in SEQ ID NOs:136-159 are derived from two
variable regions in the E1 protein, amino acids 48-80 (SEQ
ID NOs:136-147) and amino acids 138-160 (SEQ ID
NOs:148-159), one skilled in the art would recognize that
these peptides would be useful as vaccines against
hepatitis C. In the present invention, a peptide from SEQ
ID NOs:136-159 can be used alone or in combination with
other peptides shown therein as immunogens in the vaccine.
Formulations suitable for administering the peptide(s) of
the present invention, routes of administration,
pharmaceutical compositions comprising the peptides and so
forth are the same as those previously described for
recombinant E1 proteins. In addition, as described for E1
proteins, the peptide(s) can also be used to prepare
antibodies to HCV E1 protein.

The peptides of the present invention may also be
supplied in the form of a kit, alone, or in the form of a
pharmaceutical composition as described above for
recombinant E1 proteins.

Any articles or patents referenced herein are
incorporated by reference. The following examples
illustrate various aspects of the invention but are in no
way intended to limit the scope thereof.

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MATERIALS 

Serum used in these examples was obtained from
84 anti-HCV positive individuals that were previously found
to be positive for HCV RNA in a cDNA PCR assay with primer
set a from the 5' NC region of the HCV genome (Bukh, J. et
al. (1992b) Proc. Natl. Acad. Sci. USA 89:4942-4946). These
samples were from 12 countries: Denmark (DK); Dominican
Republic (DR); Germany (D); Hong Kong (HK); India (IND);
Sardinia, Italy (S); Peru (P); South Africa (SA); Sweden
(SW); Taiwan (T); United States (US); and Zaire (Z).

Example 1

Identification of the DNA Sequence
of the E1 Gene of 51 Isolates of HCV via
RT-PCR Analysis of Viral RNA Using Universal Primers

Viral RNA was extracted from 100 µl of serum by
the guanidinium-phenol-chloroform method and the final RNA
solution was divided into 10 equal aliquots and stored at
-80°C as described (Bukh et al. (1992a)). The sequences
of the synthetic oligonucleotides used in the RT-PCR assay,
deduced from the sequence of HCV strain H-77 (Ogata, N. et
al. (1991) Proc. Natl. Acad. Sci. USA 88:3392-3396), are
shown as SEQ ID NOs:103-108. One aliquot of the final RNA
solution, equivalent to 10 µl of serum, was used for cDNA
synthesis that was performed in a 20 µl reaction mixture
using avian myeloblastosis virus reverse transcriptase
(Promega, Madison, WI) and SEQ ID NO:104 as a primer. The
resulting cDNA was amplified in a "nested" PCR assay by Taq
DNA polymerase (Amplitaq, Perkin-Elmer/Cetus) as described
previously (Bukh et al. (1992a)) with primer set e (SEQ ID
NOs:103-106). Precautions were taken to avoid
contamination with exogenous HCV nucleic acid (Bukh et al.
(1992a)), and negative controls (normal, uninfected serum)
were interspersed between every test sample in both the RNA
extraction and cDNA PCR procedures. No false positive
results were observed in the analysis. In most instances,
amplified DNA (first or second PCR products) was
reamplified with primers SEQ ID NO:107 and SEQ ID NO:108
prior to sequencing since these two primers contained EcoRI
sites which would facilitate future cloning of the E1 gene.
Amplified DNA was purified by gel electrophoresis followed
by glass-milk extraction (Geneclean, BIO 101, LaJolla, CA)
and both strands were sequenced directly by the
dideoxynucleotide chain termination method (Bachman, B. et
al. (1990) Nucl. Acids Res. 18:1309) with phage T7 DNA
polymerase (Sequenase, United States Biochemicals,
Cleveland, OH), [alpha ^S]dATP (Amersham, Arlington
Heights, IL) or [alpha ^P]dATP (Amersham or DuPont,
Wilmington, DE) and sequencing primers. RNA extracted from
serum containing HCV strain H-77, previously sequenced by
Ogata, N. et al. (1991), was amplified with primer set e
(SEQ ID NOs:103-106) and sequenced in parallel as a
control. The nucleotide sequences of the envelope 1 (E1)
gene of all 51 HCV isolates are shown as SEQ ID NOs:1-51.
In all 51 HCV isolates, the E1 gene was exactly 576
nucleotides in length and did not have any in-frame stop
codons.
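
The two properties stated at the end of this example (a
length of exactly 576 nucleotides and no in-frame stop
codons) are easy to check for any one sequence. The Python
sketch below applies the check to SEQ ID NO:1 (isolate DK7)
as printed in the sequence listing of this specification; the
function and variable names are hypothetical, and the reading
frame is assumed to start at the first nucleotide.

```python
# Illustrative sketch: check length and in-frame stop codons for
# SEQ ID NO:1 (isolate DK7), transcribed from the sequence listing.
SEQ_ID_NO_1 = "".join("""
TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC
AAT GAT TGC CCT AAC TCG AGT ATC GTG TAC GAG GCG GCC
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT
CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC
CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG
CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC
AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC
TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT
ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA
GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG
ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG
TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG
""".split())

STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stop_codons(seq: str):
    """Return 1-based codon numbers of any in-frame stop codons."""
    return [i // 3 + 1
            for i in range(0, len(seq) - 2, 3)
            if seq[i:i + 3] in STOP_CODONS]


length = len(SEQ_ID_NO_1)
stops = in_frame_stop_codons(SEQ_ID_NO_1)
```

With the sequence as transcribed here, the length is 576
nucleotides (192 codons) and the list of in-frame stop codons
is empty.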



Example 2

Computer Analysis of the Nucleotide
and Deduced Amino Acid Sequences
of the E1 Gene of the 51 HCV Isolates

Multiple computer-generated alignments of the
nucleotide (SEQ ID NOs:1-51, Figures 1A-H) and deduced
amino acid sequences (SEQ ID NOs:52-102, Figures 2A-H) of
the cDNAs of the 51 HCV isolates, constructed using the
computer program GENALIGN (Miller, R.H. et al. (1990) Proc.
Natl. Acad. Sci. USA 87:2057-2061), resulted in the 51 HCV
isolates being divided into twelve genotypes based upon the
degree of variation of the E1 gene sequence as shown in
Table 1.

The nucleotide and amino acid sequence identity
of HCV isolates of the same genotype was in the range of
88.0-99.1% and 89.1-98.4%, respectively, whereas that of
HCV isolates of different genotypes was in the range of
53.5-78.6% and 49.0-82.8%, respectively. The latter
differences are similar to those found when comparing the
envelope gene sequences of the various serotypes of the
related flaviviruses, as well as other RNA viruses. When
microheterogeneity in a sequence was observed, defined as
more than one prominent nucleotide at a specific position,
the nucleotide that was identical to that of the HCV
prototype (HCV1, Choo et al. (1989)) was reported if
possible. Alternatively, the nucleotide that was identical
to the most closely related isolate is shown.
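
Percent identity between two aligned sequences of equal
length is the fraction of positions at which they agree. The
Python sketch below is a minimal illustration and is not the
GENALIGN procedure: it assumes the sequences are already
aligned without gaps, and it uses only the first 39
nucleotides of SEQ ID NO:1 and SEQ ID NO:2 as printed in the
sequence listing, so the result is not one of the
whole-gene values reported above.

```python
# Illustrative sketch: percent identity of two gap-free, equal-length
# sequences. Inputs are the first 39 nucleotides of SEQ ID NO:1 (DK7)
# and SEQ ID NO:2 (DK9), transcribed from the sequence listing.
SEQ1_FIRST_39 = ("TAC CAA GTG CGC AAC TCC ACG GGG CTT "
                 "TAC CAT GTC ACC").replace(" ", "")
SEQ2_FIRST_39 = ("TAC CAA GTA CGC AAC TCC TCG GGC CTC "
                 "TAC CAT GTC ACC").replace(" ", "")


def mismatch_count(a: str, b: str) -> int:
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def percent_identity(a: str, b: str) -> float:
    return 100.0 * (len(a) - mismatch_count(a, b)) / len(a)


differences = mismatch_count(SEQ1_FIRST_39, SEQ2_FIRST_39)
identity = percent_identity(SEQ1_FIRST_39, SEQ2_FIRST_39)
```

Over this short stretch the two isolates differ at 4 of 39
positions, i.e. 35/39 or roughly 89.7% identity.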

Analysis of the consensus sequence of the E1
protein of the 51 HCV isolates from this study demonstrated
that a total of 60 (30.3%) of the 192 amino acids of the E1
protein were invariant among these isolates (Fig. 3). Most
impressive, all 8 cysteine residues as well as 6 of 8
proline residues were invariant. The most abundant amino
acids (e.g. alanine, valine and leucine) showed a very low
degree of conservation. The consensus sequence of the E1
protein contained 5 potential N-linked glycosylation sites.
Three sites at positions 209, 305 and 325 were maintained
in all 51 HCV isolates. A site at position 196 was
maintained in all isolates except the sole isolate of
genotype 2c. Also, a site at position 234 was maintained
in all isolates except one isolate of genotype I/1a, all
four isolates of genotype IV/2b and the sole isolate of
genotype 6a. Conversely, only genotype IV/2b isolates had
a potential glycosylation site at position 233. Further
analysis revealed a highly conserved amino acid domain (aa
302-328) in the E1 protein with 20 (74.1%) of 27 amino
acids invariant among all 51 HCV isolates. It is possible
that the 5' and 3' ends of this domain are conserved due to
important cysteine residues and N-linked glycosylation
sites. The central sequence, 5'-GHRMAWDMM-3' (aa 315-323),
may be conserved due to additional functional constraints
on the protein structure. Finally, although the amino acid
sequence surrounding the putative E1 protein cleavage site
was variable, an amino acid doublet (GV) at position 380
was invariant among all HCV isolates.
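
Several of the features described above can be located in a
single isolate by translating its E1 sequence. The Python
sketch below does this for SEQ ID NO:1 (isolate DK7, genotype
I/1a) as printed in the sequence listing. It rests on stated
assumptions: the standard genetic code, the usual N-X-S/T
sequon definition (X not proline) for potential N-linked
glycosylation sites, and a numbering in which the first E1
residue is amino acid 192, chosen because it reproduces the
positions quoted in the text. All names are hypothetical.

```python
import re

# SEQ ID NO:1 (isolate DK7), transcribed from the sequence listing.
SEQ_ID_NO_1 = "".join("""
TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC
AAT GAT TGC CCT AAC TCG AGT ATC GTG TAC GAG GCG GCC
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT
CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC
CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG
CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC
AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC
TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT
ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA
GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG
ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG
TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG
""".split())

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}

E1_FIRST_RESIDUE = 192  # assumed numbering of the first E1 amino acid


def translate(nt: str) -> str:
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, len(nt), 3))


def sequon_positions(protein: str, first_residue: int):
    """Positions of Asn in N-X-S/T sequons (X is any residue but Pro)."""
    return [m.start() + first_residue
            for m in re.finditer(r"N(?=[^P][ST])", protein)]


protein = translate(SEQ_ID_NO_1)
glyco_sites = sequon_positions(protein, E1_FIRST_RESIDUE)
motif_start = protein.find("GHRMAWDMM") + E1_FIRST_RESIDUE
cysteines = protein.count("C")
doublet_at_380 = protein[380 - E1_FIRST_RESIDUE:382 - E1_FIRST_RESIDUE]
```

Under these assumptions this isolate shows sequons at
positions 196, 209, 234, 305 and 325, the GHRMAWDMM motif
starting at position 315, eight cysteine residues and the GV
doublet at position 380, consistent with the description
above.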

A dendrogram of the genetic relatedness of the E1
protein of selected HCV isolates representing the 12
genotypes is shown in Fig. 4. This dendrogram was
constructed using the program CLUSTAL (Weiner, A.J. et al.
(1991) Virology 180:842-848) and had a limit of 25
sequences. The scale showing percent identity was added
based upon manual calculation. From the 51 HCV isolates
for which the complete sequence of the E1 gene region was
obtained, 25 isolates representing the twelve genotypes
were selected for analysis as follows. Among isolates with
genotype I/1a (SEQ ID NOs:52-59), as well as among isolates
with genotype II/1b (SEQ ID NOs:60-76), the two isolates
with the lowest amino acid identity within each genotype
were included. Among isolates of genotype IV/2b, isolate
DK8 (SEQ ID NO:81) that has an amino acid identity of 96.4%
to isolate T8 (SEQ ID NO:84) was excluded. Among isolates
of genotype V/3a, isolates S2 (SEQ ID NO:88) and S54 (SEQ
ID NO:90) that both shared 97.9% of the amino acids of
isolates HK10 (SEQ ID NO:87) and S52 (SEQ ID NO:89) were
excluded. Finally, among isolates of genotype 5a, isolates
SA4 (SEQ ID NO:97) and SA5 (SEQ ID NO:98) with an amino
acid identity to isolate SA7 (SEQ ID NO:100) of 96.4% and
95.8%, respectively, were excluded. This dendrogram in
combination with the analysis of the E1 gene sequence of 51
HCV isolates in Table 1 demonstrates extensive
heterogeneity of this important gene.

The worldwide distribution of the 12 genotypes
among 74 HCV isolates is depicted in Fig. 5. The complete
E1 gene sequence was determined in 51 of these HCV isolates
(SEQ ID NOs:1-51), including 8 isolates of genotype I/1a,
17 isolates of genotype II/1b and 26 isolates comprising
genotypes III/2a, IV/2b, 2c, 3a, 4a-4d, 5a and 6a. In the
remaining 23 isolates, all of genotypes I/1a and II/1b, the
genotype assignment was based on a partial E1 gene sequence
since they did not represent additional genotypes in any of
the 12 countries. The number of isolates of a particular
genotype is given in each of the 12 countries studied. Of
the twelve genotypes, genotypes I/1a and II/1b were the
most common, accounting for 48 (65%) of the 74 isolates.
Analysis of the E1 gene sequences available in the GenBank
data base at the time of this study revealed that all 44
such sequences were of genotypes I/1a, II/1b, III/2a and
IV/2b. Thus, based upon E1 gene analysis, 8 new genotypes
of HCV have been identified.

Also of interest, different HCV genotypes were
frequently found in the same country, with the highest
number of genotypes (five) being detected in Denmark. Of
the twelve genotypes, genotypes I/1a, II/1b, III/2a, IV/2b
and V/3a were widely distributed with genotype II/1b being
identified in 11 of 12 countries studied (Zaire was the
only exception). In addition, while genotypes I/1a and
II/1b were predominant in the Americas, Europe and Asia,
several new genotypes were predominant in Africa.

It was also found that genotypes I/1a, II/1b,
III/2a, IV/2b and V/3a of HCV were widely distributed
around the world, whereas genotypes 2c, 4a, 4b, 4d, 5a and
6a were identified only in discrete geographical regions.
For example, the majority of isolates in South Africa
comprised a new genotype (5a) and all isolates in Zaire
comprised 3 new closely related genotypes (4a, 4b, 4c).
These genotypes were not identified outside Africa.

Example 3

Detection by ELISA Based on Antigen from
Insect Cells Expressing Complete E1 Protein

Expression of E1 protein in SF9 cells. A cDNA
(SEQ ID NO:1) encoding the complete E1 protein of SEQ ID
NO:52 is subcloned into pBlueBac-Transfer vector
(Invitrogen) using standard subcloning procedures. The
resultant recombinant expression vector is cotransfected
into SF9 insect cells (Invitrogen) by the Ca precipitation
method according to the Invitrogen protocol.

ELISA Based on Infected SF9 cells. 5 x 10^6 SF9
cells infected with the above-described recombinant
expression vector are resuspended in 1 ml of 10 mM
Tris-HCl, pH 7.5, 0.15 M NaCl and are then frozen and
thawed 3 times. 10 µl of this suspension is dissolved in
10 ml of carbonate buffer (pH 9.6) and used to cover one
flexible microtiter assay plate (Falcon). Serum samples are
diluted 1:20, 1:400 and 1:8000, or 1:100, 1:1000 and
1:10000. Blocking and washing solutions for use in the
ELISA assay are PBS containing 10% fetal calf serum and
0.5% gelatin (blocking solution) and PBS with 0.05%
Tween-20 (Sigma, St. Louis, MO) (washing solution). As a
secondary antibody, peroxidase-conjugated goat IgG fraction
to human IgG or horseradish peroxidase-labelled goat
anti-Old or anti-New World monkey immunoglobulin is used.
The results are determined by measuring the optical density
(O.D.) at 405 nm.
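
The two serum dilution series above are 20-fold and 10-fold
serial dilutions, and the coating step amounts to spreading
the lysate of a fixed number of cells over one plate. The
Python sketch below only restates that arithmetic; the
function names are hypothetical.

```python
# Illustrative sketch of the dilution arithmetic in this example.
def serial_dilutions(first: int, factor: int, steps: int):
    """Reciprocal dilutions, e.g. 20, 400, 8000 for 1:20, 1:400, 1:8000."""
    return [first * factor ** i for i in range(steps)]


def cell_equivalents(cells_per_ml: int, volume_ul: int) -> int:
    """Number of cells represented by a volume of lysate (1 ml = 1000 ul)."""
    return cells_per_ml * volume_ul // 1000


series_a = serial_dilutions(20, 20, 3)
series_b = serial_dilutions(100, 10, 3)
cells_per_plate = cell_equivalents(5_000_000, 10)
```

On this reading, the 10 µl of lysate used to coat one plate
corresponds to 5 x 10^4 of the 5 x 10^6 infected cells.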

To determine if insect cell-derived E1 protein
representing genotype I/1a of HCV could detect anti-HCV
antibody in chimpanzees infected with genotype I/1a of HCV,
three infected chimpanzees are examined. The sera of all
3 chimpanzees are found to seroconvert to anti-HCV.



Example 4

Use of the Complete
E1 Protein as a Vaccine

Mammals are immunized with purified or partially
purified E1 protein in an amount sufficient to stimulate
the production of protective antibodies. The immunized
mammals challenged with various genotypes of HCV are
protected.

It is understood by one skilled in the art that
the recombinant E1 protein used in the above vaccine can
also be used in combination with other recombinant E1
proteins having an amino acid sequence shown in SEQ ID
NOs:52-102.






Example 5

Determination of the Genotype of an HCV
Isolate Via Hybridization of Genotype-Specific
Oligonucleotides to RT-PCR Amplification Products.

Viral RNA is isolated from serum obtained from a
mammal and is subjected to RT-PCR as in Example 1.
Following amplification, the amplified DNA is purified as
described in Example 1 and aliquots of 100 mg of
amplification product are applied to twelve dots on a
nitrocellulose filter set in a dot blot apparatus. The
twelve dots are then cut into separate dots and each dot is
hybridized to a 32P-labelled oligonucleotide specific for a
single genotype of HCV. The oligonucleotides to be used as
hybridization probes are selected from SEQ ID NOs:109-135.

Example 6

ELISA Based on Synthetic
Peptides Derived From E1 cDNA Sequences

Synthetic peptides specific for genotype I/1a and
having amino acid sequences according to SEQ ID NOs:136-148
are placed in 0.1% PBS buffer and 50 µl of 1 mg/ml of
peptide is used to cover each well of the microtiter assay
plate. Serum samples from two mammals infected with
genotype I/1a HCV and from one mammal infected with
genotype 5a HCV are diluted as in Example 3 and the ELISA
is carried out as in Example 3. Both mammals infected with
genotype I/1a HCV react positively with the peptides while
the mammal infected with genotype 5a HCV exhibits no
reactivity.



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Example 7 
Use of the El Peptides as a Vaccine 

Since the E1 genotype-specific peptides of the
present invention are derived from two variable regions in
the complete E1 protein, there exists support for the use
of these peptides as a vaccine to protect against a variety
of HCV genotypes. Mammals are immunized with peptide(s)
selected from SEQ ID NOs:136-159 in an amount sufficient
to stimulate production of protective antibodies. The
immunized mammals challenged with various genotypes of HCV
are protected.




(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTC ACC  39
AAT GAT TGC CCT AAC TCG AGT ATC GTG TAC GAG GCG GCC  78
GAT GCC ATC CTG CAC ACT CCG GGG TGT GTC CCT TGC GTT  117
CGC GAG GGT AAC GTC TCG AGG TGT TGG GTG GCG ATG ACC  156
CCC ACG GTG GCC ACC AGG GAT GGC AAA CTC CCC ACA GCG  195
CAG CTT CGA CGT CAC ATC GAT CTG CTC GTC GGG AGT GCC  234
ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC CTG TGC GGG  273
TCT GTC TTT CTT GTC GGT CAA CTG TTT ACC TTC TCT CCC  312
AGG CGC CAC TGG ACG ACG CAA GGC TGC AAT TGT TCT ATC  351
TAT CCT GGC CAT ATA ACG GGT CAC CGC ATG GCG TGG GAT  390
ATG ATG ATG AAC TGG TCC CCT ACC ACG GCG TTG GTA GTA  429
GCT CAG CTG CTC CGG ATC CCG CAA GCC ATC TTG GAC ATG  468
ATC GCT GGT GCT CAC TGG GGA GTC CTG GCG GGC ATA GCG  507
TAT TTT TCC ATG GTG GGG AAC TGG GCG AAG GTC CTG GTA  546
GTG CTG CTG CTA TTT GCC GGC GTC GAC GCG  576



1J (2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

20 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 



25 



30 



TAC 


CAA 


GTA 


CGC 


AAC 


TCC 


TCG 


GGC 


CTC 


TAC 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCT 


AAC 


TCG 


AGT 


ATT 


GTG 


TAC 


GAG 


GCG 


GCC 


78 


GAT 


GCC 


ATC 


CTG 


CAT 


TCT 


CCA 


GGG 


TGT 


GTC 


CCT 


TGC 


GTT 


117 


CGC 


GAG 


GGT 


AAC 


GCC 


TCG 


AAA 


TGT 


TGG 


GTG 


GCG 


GTG 


GCC 


156 


CCC 


ACG 


GTG 


GCC 


ACC 


AGG 


GAC 


GGC 


AAG 


CTC 


CCC 


GCA 


ACG 


195 


CAG 


CTT 


CGA 


CGT 


CAC 


ATC 


GAT 


CTG 


CTT 


GTC 


GGG 


AGC 


GCC 


234 


ACC 


CTC 


TGC 


TCG 


GCC 


CTC 


TAT 


GTG 


GGG 


GAC 


TTG 


TGC 


GGG 


273 


TCT 


GTC 


TTC 


CTT 


GTC 


GGC 


CAA 


CTG 


TTC 


ACC 


TTC 


TCC 


CCC 


312 


AGA 


CGC 


CAC 


TGG 


ACA 


ACG 


CAA 


GAC 


TGC 


AAC 


TGT 


TCT 


ATC 


351 


TAC 


CCC 


GGC 


CAT 


ATT 


ACG 


GGT 


CAT 


CGC 


ATG 


GCG 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT 


ACA 


GCA 


GCG 


CTG 


GTA 


ATG 


429 


GCG 


CAG 


CTG 


CTC 


AGG 


ATC 


CCG 


CAG 


GCC 


ATC 


TTG 


GAC 


ATG 


468 


ATC 


GCT 


GGT 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


ATA 


GCG 


507 


TAT 


TTC 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


GTG 


GTG 


546 


GTA 


CTG 


TTG 


CTG 


TTT 


ACC 


GGC 


GTC 


GAT 


GCG 








576

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3: 



CAC 


CAA 


GTG 


CGC 


AAC 


TCT 


ACA 


GGG 


CTT 


TAC 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCT 


AAT 


TCG 


AGT 


ATT 


GTG 


TAC 


GAG 


GCG 


GCC 


78 


GAT 


GCC 


ATC 


CTG 


CAC 


GCG 


CCG 


GGG 


TGT 


GTC 


CCT 


TGC 


GTT 


117 


CGC 


GAG 


GGT 


AAC 


GCC 


TCG 


AGG 


TGT 


TGG 


GTG 


GCG 


GTG 


ACC 


156 


CCC 


ACG 


GTG 


GCC 


ACC 


AGG 


GAC 


GGC 


AAA 


CTC 


CCC 


ACA 


ACG 


195 


CAG 


CTT 


CGA 


CGT 


CAC 


ATC 


GAC 


CTG 


CTT 


GTC 


GGG 


AGC 


GCC 


234 


ACC 


CTC 


TGC 


TCG 


GCC 


CTC 


TAC 


GTG 


GGG 


GAC 


CTG 


TGC 


GGG 


273 


TCT 


GTC 


TTC 


CTT 


GTC 


GGT 


CAA 


CTG 


TTC 


ACC 


TTT 


TCT 


CCC 


312 


AGG 


CGC 


CAC 


TGG 


ACA 


ACG 


CAA 


GAC 


TGC 


AAT 


TGT 


TCT 


ATC 


351 


TAT 


CCC 


GGC 


CAT 


ATA 


ACG 


GGA 


CAC 


CGT 


ATG 


GCA 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT 


ACG 


ACA 


GCG 


CTG 


GTA ATG 


429 


GCT 


CAG 


CTG 


CTC 


CGG 


ATC 


CCA 


CAA 


GCC 


ATC 


TTG 


GAC 


ATG 


468 


ATC 


GCT 


GGA 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


ATA 


GCG 


507 


TAT 


TTC 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


GTG 


GTA 


546 


GTG 


CTG 


TTG 


CTG 


TTT 


GCC 


GGC 


GTT 


GAT 


GCG 








576 




(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 





(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:




CAC 


CAA 


GTG 


CGC 


AAC 


TCT 


ACA 


GGG 


CTT 


TAC 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCT 


AAT 


TCG 


AGT 


ATT 


GTG 


TAC 


GAG 


GCG 


GCC 


78 


GAT 


GCC 


ATC 


CTG 


CAC 


ACG 


CCG 


GGG 


TGT 


GTC 


CCT 


TGC 


GTT 


117 


CGC 


GAG 


GGT 


AAC 


ACC 


TCG 


AGG 


TGT 


TGG 


GTG 


GCG 


GTG 


ACC 


156 


CCC 


ACG 


GTG 


GCC 


ACC 


AGG 


GAC 


GGC 


AAA 


CTC 


CCC 


ACA 


ACG 


195 


CAG 


CTC 


CGA 


CGT 


CAC 


ATC 


GAC 


CTG 


CTT 


GTC 


GGG 


AGC 


GCC 


234 


ACC 


CTC 


TGC 


TCG 


GCC 


CTC 


TAC 


GTG 


GGG 


GAC 


TTG 


TGC 


GGG 


273 


TCT 


GTC 


TTC 


CTT 


GTC 


GGT 


CAA 


CTG 


TTC 


ACC 


TTC 


TCT 


CCC 


312 


AGG 


CAC 


CAC 


TGG 


ACA 


ACG 


CAA 


GAC 


TGC 


AAT 


TGT 


TCC 


ATC 


351


TAT


CCC 


GGC 


CAT 


ATA ACG 


GGC 


CAC 


CGC 


ATG 


GCG 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT ACG 


ACA 


GCG 


CTG 


GTA 


GTA 


429 


GCT 


CAG 


CTG 


CTC 


CGG 


ATC 


CCA 


CAA 


GCC 


ATC 


TTG 


GAC 


ATG 


468 


ATC 


GCT 


GGT 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


ATA 


GCG 


507 


TAT 


TTC 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


CTG 


GTA 


546 


GTG 


CTG 


TTG 


CTG 


TTT 


GCC 


GGC 


GTT 


GAT 


GCG 








576 



(2) INFORMATION FOR SEQ ID NO: 5: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 


(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5: 



TAC CAA GTG CGC AAC TCC ACG GGG CTT TAC CAT GTT ACC 39

AAT GAT TGC CCT AAC TCG AGT ATT GTG TAC GAG ACA GCT 78

GAT GCT ATC CTA CAC GCT CCG GGA TGT GTC CCT TGC GTT 117

CGT GAG GGT AAC ACC TCG AGG TGT TGG GTG GCG ATG ACC 156

CCC ACG GTG GCC ACC AGG GAC GGC AAA CTC CCC GCA ACG 195

CAG CTT CGA CGT TAC ATC GAT CTG CTT GTC GGG AGC GCC 234

ACC CTC TGT TCG GCC CTC TAC GTG GGG GAC TTG TGC GGG 273

TCT GTC TTT CTT GTC GGT CAG CTG TTT ACC TTC TCT CCC 312

AGG CGC CTC TGG ACG ACG CAA GAC TGC AAT TGT TCT ATC 351

TAT CCC GGC CAT ATA ACG GGT CAT CGC ATG GCA TGG GAT 390

ATG ATG ATG AAC TGG TCC CCT ACG ACG GCA CTG GTA GTA 429

GCT CAG CTG CTC CGG ATC CCA CAA GCC ATC TTG GAT ATG 468

ATC GCT GGT GCT CAC TGG GGA GTC CTA GCG GGC ATA GCG 507

TAT TTC TCC ATG GTG GGA AAC TGG GCG AAG GTC CTA GTG 546

GTG CTG CTG CTA TTC GCC GGC GTT GAC GCG 576









(2) INFORMATION FOR SEQ ID NO: 6: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S18 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 






TAC CAA GTA CGC AAC TCC ACG GGC CTT TAC CAT GTC ACC 39


AAT


GAC 


TGC 


CCT 


AAC 


TCG 


AGC 


ATT

GTG 


TAC 


GAG 


ACG 


GCC 


78 


GAT 


ACC 


ATC 


CTA 


CAC 


TCT 


CCG 


GGG 


TGT 


GTC 


CCT 


TGC 


GTT 


117 


CGC 


GAG 


GGT 


AAC 


GCC 


TCG 


AGA 


TGT 


TGG 


GTG 


CCG 


GTG 


GCC 


156 


CCC 


ACA 


GTT 


GCC 


ACC 


AGG 


GAC 


GGC 


AAA 


CTC 


CCC 


GCA 


ACG 


195 


CAG 


CTT 


CGA 


CGT 


CAC 


ATC 


GAT 


CTG 


CTT 


GTT 


GGG 


AGC 


GCC 


234 


ACC 


CTC 


TGC 


TCG 


GCC 


CTC 


TAT 


GTG 


GGG 


GAC 


CTG 


TGC 


GGG 


273 


TCT 


GTC 


TTT 


CTT 


GTC 


AGC 


CAG 


CTG 


TTC 


ACT 


ATC 


TCC 


CCC 


312 


AGG 


CGC 


CAC 


TGG 


ACA 


ACG 


CAA 


GAC 


TGC 


AAC 


TGT 


TCT 


ATC 


351 


TAC 


CCC 


GGC 


CAT 


ATA 


ACG 


GGT 


CAC 


CGT 


ATG 


GCA 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT 


ACA 


ACG 


GCG 


TTG 


GTA 


ATA 


429 


GCT 


CAG 


CTG 


CTC 


AGG 


GTC 


CCG 


CAA 


GCC 


GTC 


TTG 


GAC 


ATG 


468 


ATC 


GCT 


GGT 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


ATA 


GCG 


507 


TAT 


TTC 


TCC 


ATG 


GCG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


CTG 


CTA 


546 


GTG 


CTG 


TTG 


CTG 


TTT 


GCC 


GGC 


GTC 


GAT 


GCG 








576 




(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 






TAC 


CAA 


GTA 


CGC 


AAC 


TCC 


TCG 


GGC 


CTT 


TAC 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCT 


AAC 


TCG 


AGT 


ATT 


GTG 


TAC 


GAG 


ACG 


GCC 


78 


GAT 


GCC 


ATT 


CTA 


CAC 


TCT 


CCA 


GGG 


TGT 


GTC 


CCT 


TGC 


GTT 


117 


CGC 


GAG 


GAT 


GGC 


GCC 


CCG 


AAG 


TGT 


TGG 


GTG 


GCG 


GTG 


GCC 


156 


CCC 


ACA 


GTC 


GCC 


ACT 


AGG 


GAC 


GGC 


AAA 


CTC 


CCT 


GCA 


ACG 


195 


CAG 


CTT 


CGA 


CGT 


CAC 


ATC 


GAT 


CTG 


CTT 


GTC 


GGA 


AGC 


GCC 


234 


ACC 


CTC 


TGC 


TCG 


GCC 


CTC 


TAC 


GTG 


GGG 


GAC 


TTG 


TGC 


GGG 


273 


TCT 


GTC 


TTT 


CTC 


GTC 


AGT 


CAA 


CTG 


TTC 


ACG 


TTC 


TCC 


CCC 


312 


AGG 


CGC 


CAC 


TGG 


ACA 


ACG 


CAA 


GAC 


TGT 


AAC 


TGT 


TCT 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


ATA 


ACG 


GGT 


CAC 


CGC 


ATG 


GCA 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCC 


ACA 


ACA 


GCG 


CTG 


GTA 


GTA 


429 


GCT 


CAG 


CTG 


CTC 


AGG 


ATC 


CCG 


CAA 


GCC 


GTC 


TTG 


GAC 


ATG 


468 


ATC 


GCT 


GGT 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


ATA 


GCG 


507 


TAT 


TTC 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


CTG 


ATA 


546 


GTG 


CTG 


TTG 


CTG 


TTT 


TCC 


GGC 


GTC 


GAT 


GCG

(2) INFORMATION FOR SEQ ID NO:8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



WO 95/01442 



PCT/US94/07320 



- 55 - 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:




TAC 


V_iiA 






7V TV f» 






GGG 


CTT 


TAC 


CAT 


GTC 


ACC 


39 


AAT 


UAi 




CCT 


AAL 




Aval 


ATT 


GTG 


TAC 


GAG 


GCG 


GCC 


78 


GAT 


GCC 


ATC 


CTG 


CAC 


ACT 


CCG 


GGG 


TGT 


GTT 


CCT 


TGC 


GTT 


117 


CGC 


GAG 


GGT 


AAC 


GCT 


TCG 


AGG 


TGT 


TGG 


GTG 


GCG 


ATG 


ACC 


156 


CCC 


ACG 


GTG 


GCC 


ACC 


AGG 


GAC 


GGC 


AAA 


CTC 


CCC 


ACA 


ACG 


195 


CAA 


CTT 


CGA 


CGT 


CAC 


ATC 


GAT 


CTG 


CTT 


GTC 


GGG 


AGC 


GCC 


234 


ACC 


CTC 


TGT 


TCG 


GCC 


CTC 


TAC 


GTG 


GGG 


GAC 


CTG 


TGC 


GGG 


273 


TCT 


GTC 


TTT 


CTT 


GTC 


GGT 


CAA 


CTG 


TTT 


ACC 


TTC 


TCT 


CCC 


312 


AGA


CGC 


CAC 


TGG 


ACG 


ACG 


CAG 


GGC 


TGC 


AAT 


TGT 


TCT 


ATC 


351 


TAT 


CCC 


GGC 


CAT 


ATA 


ACG 


GGT 


CAC 


CGC 


ATG 


GCA 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT 


ACG 


GCG 


GCG 


TTG 


GTG 


GTA 


429 


GCT 


CAG 


CTG 


CTC 


CGG 


ATC 


CCA 


CAA 


GCC 


ATC 


TTG 


GAC 


ATG 


468 


ATC 


GCT 


GGT 


GCT 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


ATA 


GCG 


507 


TAT 


TTC 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCG 


AAG 


GTC 


CTG 


GTA 


546 


GTG 


CTG 


CTG 


CTA 


TTT 


GCC 


GGC 


GTC 


GAC 


GCG

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 






TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTG 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGT 


TCC 


AAC 


TCG 


AGC 


ATT 


GTG 


TAT 


GAG 


ACA 


GCG 


78 


GAC 


ATG 


ATC 


ATG 


CAC 


ACC 


CCC 


GGG 


TGC 


GTG 


CCC 


TGC 


GTT 


117 


CGG 


GAG 


GAC 


AAC 


TCC 


TCT 


CGC 


TGC 


TGG 


GTA 


GCG 


CTC 


ACC 


156 


CCC 


ACG 


CTC 


GCG 


GCT 


AGG 


AAT 


GGC 


AAC 


GTC 


CCC 


ACT 


ACG 


195 


GCG 


ATA 


CGA 


CGC 


CAC 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCC 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTC 


ATC 


TCC 


CAG 


CTG 


TTC 


ACC 


CTC 


TCG 


CCT 


312 


CGC 


CGG 


CAT 


GAG 


ACG 


GTA 


CAG 


GAG 


TGT 


AAT 


TGC 


TCA 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTG 


ACA 


GGT 


CAC 


CGT 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


ACA 


GCC 


TTA 


GTG 


GTA 


429 


TCG 


CAG 


TTA 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


ATG 


GAC 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGG 


GTC 


CTG 


GCG 


GGC 


CTC 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTC 


TTT 


GCT 


GGC 


GTT 


GAC 


GGC



(2) INFORMATION FOR SEQ ID NO: 10: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE : 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: D3 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTG 


TAC 


CAA 


GTC 


ACC 


39 


AAT 


GAC 


TGT 


TCC 


AAC 


TCG 


AGC 


ATC 


GTG 


TAT 


GAG 


ACA 


GCG 


78 


GAC 


ATG 


ATC 


ATG 


CAC 


ACC 


CCC 


GGG 


TGC 


GTG 


CCC 


TGC 


GTT


117 


CGG 


GAG 


GAC 


AAC 


TCC 


TCT 


CGC 


TGC 


TGG 


GTA 


GCG 


CTC 


ACC 


156 


CCC 


ACG 


CTC 


GCG 


GCT 


AGG 


AAT 


AGC 


AGC 


GTC 


CCC 


ACT 


ACG 


195 


ACA 


ATA 


CGA 


CGC 


CAC 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCC 


ATG 


TAC 


GTG 


GGG 


GAT 


CTT 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTC 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCG 


CCT 


312 


CGC 


CGG 


CAT 


GAG 


ACA 


GTA 


CAG 


GAA 


TGT 


AAC 


TGC 


TCA 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTG 


ACA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCG 


CCT 


ACA 


GCA 


GCC 


CTA 


GTG 


GTA 


429 


TCG 


CAG 


TTA 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


GTG 


GAC 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGG 


GTC 


CTG 


GCG 


GGC 


CTC 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTC 


TTT 


GCT 


GGC 


GTC 


GAC 


GGC

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:




TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTG 


TAC 


CAC 


GTC 


ACA 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGC 


ATC 


GTG 


TAT 


GAG 


GCA 


GTG 


78 


GAC 


GTG 


ATC 


ATG 


CAT 


ACC 


CCA 


GGG 


TGC 


GTG 


CCC 


TGC 


GTT 


117 


CGG 


GAG 


AAC 


AAC 


CAC 


TCC 


CGT 


TGC 


TGG 


GTA 


GCG 


CTC 


ACC 


156 


CCC 


ACG 


CTC 


GCG 


GCC 


AGG 


AAC 


GCC 


AGC 


ATC 


CCC 


ACT 


ACG 


195 


ACA 


ATA 


CGA 


CGC 


CAT 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAC 


CTC 


TGC 


GGA 


273 


TCC 


GTT 


TTC 


CTC 


GTC 


TCT 


CAG 


CTG 


TTC 


ACC 


TTT 


TCA 


CCT 


312 


CGC 


CGG 


CAT 


GAG 


ACA 


GCA 


CAG 


GAC 


TGC 


AAC 


TGC 


TCA 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTT 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


ACA 


GCC 


CTA 


GTG 


CTA 


429 






TCG CAG TTA CTC CGA ATC CCA CAA GCT GTC GTG GAC ATG 468 

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTC GCC 507 

TAC TAG TCC ATG GCG GGG AAC TGG GCC AAG GTT TTA ATT 546 

GTG TTG CTA CTC TTT GCC GGC GTT GAT GGG 576 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 






TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


ATA 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGC 


GTC 


GTG 


TAT 


GAG 


ACA 


GCA 


78 


GAC 


ATG 


ATC 


ATG 


CAT 


ACC 


CCT 


GGA 


TGC 


GTG 


CCC 


TGC 


GTA 


117 


CGG 


GAG 


AAC 


AAC 


TCC 


TCC 


CGC 


TGT 


TGG 


GTA 


GCG 


CTC 


ACT 


156 


CCC 


ACG 


CTC 


GCG 


GCC 


AGG 


AAC 


GTC 


AGC 


GTC 


CCC 


ACC 


ACG 


195 


ACA 


ATA 


CGA 


CGT 


CAC 


GTC 


GAC 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCC 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTT 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCG 


CCT 


312 


CGC 


CGA 


CAC 


GAG 


ACA 


GTA 


CAG 


GAC 


TGC 


AAC 


TGC 


TCA 


CTC 


351 


TAT 


CCC 


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCC 


CCT 


ACA 


GCA 


GCC 


CTA 


GTG 


GTG 


429 


TCG 


CAA 


TTA 


CTC 


CGG 


ATC 


CCG 


CAA 


GCT 


GTC 


GTG 


GAC 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


CTT 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGA 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTT 


TTT 


GCC 


GGC 


GTT 


GAT 


GGG

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 






CAT GAA GTG CAC AAC GTA TCC GGG ATC TAC CAT GTC ACG 39 
AAC GAC TGC TCC AAC TCA AGT ATT GTG TAT GAG GCA GCG 78 
GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTC 117 






pgg 

WWW 


gag 


Ann 


AAP 




TCC 

X WW 


WW J. 


- ^ ft 

TGP 

X WW 


TCG 

X WW 


GTA 

wXX% 


WWW 


pTP 
wX w 


71 PT 

X^W X 


156 


PPP 

www 


21 pg 


Liu 


GPG 


gpp 


AV3U 


nrlw 


gpp 

WWW 


21GP 


ill w 


pnp 

www 


7V p" J ' 

AW X 


aPG 




a.pa 

nLn 


nlii 


ppa 


LAjw 




gtp 


gup 


TTPI 
1 ±\3 


PTP 


PITT 
w 1 X 


rzrzrz 


gpg 


ww X 




PPT 
l?wX 


J. x w 


1 w'w 


X LL 






Tar 1 

Xxlw 




ggzi 


wnX 


PTP 
w X w 


TGP 
X ww 


GG& 


1*71 
Z / J 


1L1 


gtpp 

VjI w 


1 XV— 


Lit 


gtp 


tpp 
X ww 


pivg 


' 1 11 1 YZ 


fnp 

X 1L 




X X w 


TPG 

X WO" 


PPT 
ww X 




pnp 




Lnl 




aPG 


\j ±Jrx 


P2VG 


gap 


TGP 

Xv?w 


aaX 


TGP 
X ww 


X wn 


aTP 

AX w. 




TAT 


CCC


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


GCA 


GCC 


CTA 


GTG 


GTA 


429 


TCG 


CAG 


TTA 


CTC 


CGA 


CTC 


CCA 


CAA 


GCT 


GTC 


ATG 


GAC 


ATG 


468 


GTG 


GCG 


GGA 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


CTT 


GCT 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCC 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTC 


TTT 


GCC 


GGC 


GTT 


GAC 


GGG

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 



TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTA 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TTA 


AGC 


ATC 


GTG 


TAC 


GAG 


ACA 


ACG 


78 


GAC 


ATG 


ATC 


ATG 


CAC 


ACC 


CCT 


GGG 


TGC 


GTG 


CCC 


TGC 


GTT 


117 


CGG 


GAA 


AAC 


AAC 


TCC 


TCC 


CGT 


TGT 


TGG 


GTA 


GCG 


CTC 


GCC 


156 


CCC 


ACG 


CTC 


GCG 


GCC 


AGG 


AAC 


GCC 


AGC 


GTC 


CCC 


ACC 


ACG 


195 


GCA 


ATA 


CGA 


CGC 


CAC 


GTC 


GAC 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTT 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTC 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCG 


CCT 


312 


CGC 


CGA 


CAC 


GAG 


ACG 


GTA 


CAG 


GAC 


TGC 


AAC 


TGC 


TCA 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTA 


ACA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


ACA 


GCC 


CTA 


GTG 


GTG 


429 


TCG 


CAG 


TTA 


CTC 


CGG 


ATC 


CCG 


CAA 


GCT 


GTC 


GTG 


GAC 


ATG 


468 


GTA 


GCG 


GGG 


GCC 


CAC 


TGG 


GGG 


GTC 


CTG 


GCG 


GGC 


CTT 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGA 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTT 


TTT 


GCC 


GGC 


GTT 


GAT 


GGG

(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:



TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


ATA 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGC 


ATC 


GTG 


TAT 


GAA 


ACA 


GCG 


78 


GAC 


ATG 


ATT 


ATG 


CAT 


ACC 


CCT 


GGA 


TGC 


ATG 


CCC 


TGC 


GTT 


117 


CGG 


GAG 


AAC 


AAC 


TCC 


TCC 


CGT 


TGC 


TGG 


GTG 


GCG 


CTC 


ACT 


156 


CCC 


ACG 


CTC 


GCG 


GCT 


AGG 


AAT 


GTC 


AGC 


GTC 


CCC 


ACT 


ACG 


195 


ACA 


ATA 


CGA 


CGC 


CAC 


GTC 


GAC 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTC 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTT 


TCG 


CCT 


312 


CGC 


CGA 


CAC 


GAG 


ACG 


GTA 


CAG 


GAC 


TGC 


AAC 


TGC 


TCA 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCG 


CCC 


ACA 


ACA 


GCC 


CTA 


GTG 


GTG 


429 


TCG 


CAG 


TTA 


CTC 


CGG 


ATC 


CCG 


CAA 


GCT 


ATC 


GTG 


GAC 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGA 


GTC 


CTA 


GCG 


GGC 


CTT 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGC 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTG 


TTT 


GCC 


GGC 


GTT 


GAT 


GGG

(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 






TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTG 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGT 


ATT 


GTG 


TAT 


GAG 


GCA 


GCG 


78 


GAC 


ATG 


ATC 


ATG 


CAC 


ACT 


CCC 


GGG 


TGC 


GTG 


CCC 


TGC 


GTT 


117 


CGG 


GAG 


GGC 


AAC 


TCC 


TCT 


CGC 


TGC 


TGG 


GTA 


GCG 


CTC 


ACT 


156 


CCC 


ACT 


CTC 


GCG 


GCC 


AGG 


AAC 


GCC 


AGC 


GTC 


TCC 


ACC 


ACG 


195 


ACA 


ATA 


CGA 


CAC 


CAC 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGT 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTA 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTC 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCA 


CCG 


312 


CGC 


CGG 


CAT 


GAG 


ACA 


GTA 


CAG 


GAC 


TGC 


AAT 


TGC 


TCC 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCC 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


GCA 


GCC 


CTA 


GTG 


GTA 


429 


TCG 


CAG 


TTG 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


GTG 


GAT 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGA 


ATC 


CTG 


GCG 


GGC 


CTT 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTA 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTC 


TTT 


GCC 


GGC 


GTT 


GAC 


GGG

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: IND8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



TAT 


GAG 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTG 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGT 


ATT 


GTG 


TAT 


GAG 


GCA 


GCG 


78 


GAC 


ATG 


ATC 


ATG 


CAC 


ACC 


CCC 


GGG 


TGC 


GTG 


CCC 


TGC 


GTT 


117 


CGG 


GAG 


GGC 


AAC 


TTC 


TCT 


AGT 


TGC 


TGG 


GTA 


GCG 


CTC 


ACT 


156 


CCC 


ACT 


CTC 


GCG 


GCT 


AGG 


AAC 


GCC 


AGC 


GTC 


CCC 


ACC 


ACG 


195 


ACA 


ATA 


CGA 


CGC 


CAC 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGT 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTT 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCA 


CCG 


312 


CGC 


CGG 


CAT 


GAG 


ACA 


GTA 


CAG 


GAC 


TGC 


AAT 


TGC 


TCC 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCT 


ACA 


GCG 


GCC 


CTA 


GTG 


GTA 


429 


TCG 


CAG 


TTG 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


GTG 


GAT 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGA 


ATC 


CTG 


GCG 


GGC 


CTT 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTA 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTC 


TTT 


GCC 


GGC 


GTT 


GAC 


GGG

(2) INFORMATION FOR SEQ ID NO: 18:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:




TAT 


GAA 


GTG 


CGC 


AAC 


GTG 


TCC 


GGG 


GTG 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGT 


ATT 


GTG 


TAT 


GAG 


GCA 


GCG 


78 


GAC 


ATG 


ATA 


ATG 


CAC 


ACC 


CCC 


GGG 


TGC 


GTG 


CCC 


TGT 


GTT 


117 


CGG 


GAG 


AAC 


AAC 


TCC 


TCC 


CGC 


TGC 


TGG 


GTA 


GCG 


CTC 


ACT 


156 


CCC 


ACA 


CTC 


GCG 


GCT 


AGG 


AAT 


TCC 


AGC 


GTC 


CCA 


ACT 


ACG 


195 


GCA 


ATA 


CGA 


CGC 


CAT 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TGC 


GGA 


273 


TCT 


GTT 


CTC 


CTC 


GTC 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCA 


CCT 


312 


CGC 


CGG 


CAT 


TGG 


ACA 


GTA 


CAG 


GAC 


TGC 


AAT 


TGT 


TCA 


ATC 


351 


TAT 


CCT 


GGC 


CAC 


GTA 


TCA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCG 


CCC 


ACA 


GCA 


GCC 


CTA 


GTG 


GTG 


429 






TCG CAG CTA CTC CGG ATC CCA CAA GCT ATC TTG GAT GTG 468 

GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507 

TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTC TTG ATT 546 

GTG ATG CTA CTC TTT GCC GGC GTT GAC GGA 576 



(2) INFORMATION FOR SEQ ID NO: 19: 


(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 






TAT 


GAA 


GTG 


CGC 


AAC 


GTA 


TCC 


GGG 


GCG 


TAC 


CAT 


GTC 


ACG 


39 


AAC 


GAC 


TGC 


TCC 


AAC 


TCA 


AGT 


ATT 


GTG 


TAC 


GAG 


GCA 


GCG 


78 


GAC 


GTG 


ATC 


ATG 


CAT 


ACC 


CCC 


GGG 


TGT 


GTA 


CCC 


TGC 


GTT 


117 


CAG 


GAG 


GGT 


AAC 


TCC 


TCC 


CAA 


TGC 


TGG 


GTG 


GCG 


CTC 


ACC 


156 


CCC 


ACG 


CTC 


GCG 


GCC 


AGG 


AAC 


GCT 


ACC 


GTC 


CCC 


ACC 


ACG 


195 


ACA 


ATA 


CGA 


CGT 


CAT 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GTT 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAC 


CTG 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTC 


ATC 


TCC 


CAG 


CTG 


TTC 


ACC 


ATC 


TCG 


CCC 


312 


CGT 


CGG 


CAT 


GAG 


ACA 


GTA 


CAG 


AAC 


TGC 


AAT 


TGC 


TCA 


ATC 


351 


TAT 


CCC 


GGA 


CAC 


GTG 


ACA 


GGT 


CAT 


CGC 


ATG 


GCC 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCG 


CCT


ACA 


ACA 


GCC 


CTA 


GTG 


GTA 


429 


TCG 


CAG 


CTA 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


ATG 


GAT 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGA 


GTC 


CTG 


GCG 


GGC 


CTC 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


TTG 


ATT 


546 


GTG 


ATG 


CTA 


CTT 


TTT 


GCT 


GGT 


GTT 


GAC 


GGG

(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

TAT GAA GTG CGC AAC GTG TCC GGG GCG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GTG 78 
GAC GTG ATC CTG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117 






CGG 


GAG 


AAC 


AAC 


TCC 


TCC 


CGT 


TGC

TGG 


GTG 


GCG 


CTC 


ACT 


156 


CCC 


ACG 


CTC 


GCG 


GCC 


AGG 


AAC 


TCC 


AGC 


GTC 


CCC 


ACT 


ACG 


195 


ACA 


ATA 


CGA 


CGT 


CAC 


GTC 


GAT 


TTG 


CTC 


GTT 


GGG 


GCG 


GCT 


234 


GCT 


TTC 


TGC 


TCC 


GCT 


ATG 


TAC 


GTG 


GGG 


GAT 


CTC 


TGC 


GGA 


273 


TCT 


GTT 


TTC 


CTT 


GTT 


TCC 


CAG 


CTG 


TTC 


ACC 


TTC 


TCG 


CCT 


312 


CGT 


CGG 


CAT 


GAG 


ACA 


GTA 


CAG 


GAC 


TGC 


AAC 


TGT 


TCA 


ATC 


351 


TAT 


CCC 


GGC 


CAC 


GTA 


ACA 


GGT 


CAC 


CGC 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCG 


CCT 


ACA 


GCA 


GCC 


TTA 


GTG 


GTA 


429 


TCG 


CAG 


TTA 


CTC 


CGG 


ATC 


CCA 


CAA 


GCT 


GTC 


GTG 


GAC 


ATG 


468 


GTG 


GCG 


GGG 


GCC 


CAC 


TGG 


GGA 


GTC 


CTG 


GCG 


GGC 


CTT 


GCC 


507 


TAC 


TAT 


TCC 


ATG 


GTG 


GGG 


AAC 


TGG 


GCT 


AAG 


GTT 


CTG 


ATT 


546 


GTG 


ATG 


CTA 


CTC 


TTT 


GCC 


GGC 


GTT 


GAC 


GGG 
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(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG AAC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC TCC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCC ATG TAC GTG GGG GAC CTC TGC GGA 273
TCT GTT TTC CTT GTC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG TAT GAG ACA GTA CAG GAC TGC AAT TGC TCA ATC 351
TAT CCC GGC CGC GTA ACA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCA CCT ACA ACA GCT CTA GTA GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT ATC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTA GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTT ATG CTA CTC TTT GCC GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 22:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:



TAT GAA GTG CGC AAC GTG TCC GGG GTG TAT CAT GTC ACG 39
AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAT ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GCC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTA GCA GCC AGG AAC ACC AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GTT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTT TCA CCT 312
CGC CGG CAC GAG ACA GTA CAG GAC TGC AAC TGT TCC ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAC 390
ATG ATG ATG AAC TGG TCA CCT ACA GCA GCC CTG GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA GAA GCT GTC GTG GAC ATG 468
GTA GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCA 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG ATG CTA CTC TTT GCT GGC GTT GAC GGG 576

(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 



TAC GAA GTG CGC AAC GTG TCC GGG GTG TAC TAT GTC ACG 39
AAC GAC TGT TCC AAC TCA AGC ATT GTG TAT GAG ACA GCG 78
GAC ATG ATC ATG CAC ACC CCT GGG TGC GTG CCC TGC GTT 117
CGG GAG AGC AAT TCC TCC CGC TGC TGG GTA GCG CTT ACT 156
CCC ACG CTC GCG GCC AGG AAC GCC AGC GTC CCC ACT AAG 195
ACA ATA CGA CGT CAC GTC GAC TTG CTC GTT GGG GCG GCT 234
GCT TTC TGT TCC GCT ATG TAC GTG GGG GAT CTC TGC GGA 273
TCT GTT TTC CTC GTC TCC CAG CTG TTC ACT TTC TCG CCT 312
CGC CGG CAT GAG ACA GTA CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAC GTA ACA GGT CAC CGT ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACA ACG GCA CTA GTG GTG 429
TCG CAG TTG CTC CGG ATC CCA CAA GCT GTC GTG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT TTG ATT 546
GTG CTG CTA CTC TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 24:



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T10 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 



TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TTT GAG GCA GCG 78
GAC TTG ATC ATG CAC ACC CCC GGG TGC GTG CCC TGC GTT 117
CGG GAG GGC AAC TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC ACC AGC GTC CCC ACT ACG 195
ACG ATA CGA CGC CAT GTC GAT TTG CTC GTT GGG GCG GCT 234
GCT TTC TGC TCC GCT ATG TAT GTG GGA GAC CTC TGC GGA 273
TCT GTT TTC CTC GTC TCT CAG CTG TTC ACC TTC TCG CCT 312
CGC CGG CAT GAG ACT TTG CAG GAC TGC AAC TGC TCA ATC 351
TAT CCC GGC CAT CTG TCA GGT CAC CGC ATG GCT TGG GAC 390
ATG ATG ATG AAC TGG TCG CCT ACA ACA GCT CTA GTG GTG 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG ACA GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GCG GGG AAC TGG GCT AAG GTT TTA ATT 546
GTG ATG CTA CTC TTT GCC GGC GTT GAT GGG 576

(2) INFORMATION FOR SEQ ID NO: 25:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US 6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

TAT GAA GTG CGC AAC GTG TCC GGG ATG TAC CAT GTC ACG 39
AAC GAC TGC TCC AAC TCA AGC ATT GTG TAT GAG GCA GCG 78
GAC ATG ATC ATG CAC ACT CCC GGG TGC GTG CCC TGT GTT 117
CGG GAG AAC AAT TCC TCC CGC TGC TGG GTA GCG CTC ACT 156
CCC ACG CTC GCG GCC AGG AAC GCT AGC GTC CCC ACT ACG 195
ACA ATA CGA CGC CAC GTC GAT TTG CTC GTT GGG GCG GCT 234
ACT TTC TGC TCC GCT ATG TAC GTG GGG GAC CTC TGC GGG 273
TCC GTT TTC CTC ATC TCC CAG CTG TTC ACC TTC TCG CCT 312
CGT CAG CAT GAG ACA GTA CAG GAC TGC AAT TGT TCA ATC 351
TAT CCC GGC CAC GTA TCA GGT CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCA CCT ACA GCA GCC CTA GTG GTA 429
TCG CAG TTA CTC CGG ATC CCA CAA GCT GTC ATG GAC ATG 468
GTG GCG GGG GCC CAC TGG GGA GTC CTG GCG GGC CTT GCC 507
TAC TAT TCC ATG GTG GGG AAC TGG GCT AAG GTT CTG ATT 546
GTG TTG CTA CTC TTT GCC GGC GTT GAC GGG 576



(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 



GCC CAA GTG AGG AAC ACC AGC CGC GGT TAC ATG GTG ACT 39
AAC GAC TGT TCC AAT GAG AGC ATC ACC TGG CAG CTC CAA 78
GCC GCG GTT CTC CAC GTC CCC GGG TGT ATC CCG TGT GAG 117
AGG CTG GGA AAT ACA TCC CGA TGC TGG ATA CCG GTC ACA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCT CTT ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCC CTC TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATT GTC TCG CCG 312
CGA CGC CAC TGG TTT GTG CAA GAA TGC AAT TGC TCC ATC 351
TAC CCC GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC GGC GGG GCT CAC TGG GGC GTC ATG TTT GGC TTG GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC ATT GTC 546
ATC CTC TTG CTG GCT GCT GGG GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T4 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 



GCA CAA GTG AAG AAC ACC ACT AAC AGC TAC ATG GTG ACC 39
AAC GAC TGT TCT AAT GAC AGC ATC ACT TGG CAG CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGT GTC CCG TGC GAG 117
AAA ACG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTT TCA 156
CCA AAC GTG GCC GTG CGG CAG CCC GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATT GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCT GCT CTT TAC GTG GGG GAC CTC TGC GGC 273
GGG GTG ATG CTC GCA GCC CAG ATG TTC ATC GTC TCG CCG 312
CAA CAT CAC TGG TTT GTG CAA GAC TGC AAT TGC TCT ATC 351
TAC CCT GGC ACC ATC ACT GGA CAC CGT ATG GCA TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACC ATG ATC CTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC TTA GAC ATC 468
GTT AGC GGG GCA CAC TGG GGC GTC ATG TTC GGC TTG GCC 507
TAC TTC TCT ATG GAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTG GCC GCT GGG GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:



GCC GAA GTG AAG AAC ACC AGT ACC AGC TAC ATG GTG ACA 39
AAT GAC TGT TCC AAC GAC AGC ATC ACC TGG CAA CTC CAG 78
GCC GCG GTC CTC CAC GTC CCC GGG TGC GTC CCG TGC GAG 117
AGA GTT GGA AAC GCG TCG CGG TGC TGG ATA CCG GTC TCG 156
CCA AAC GTA GCT GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACG CAC ATC GAC ATG GTT GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTC TAC GTG GGG GAT CTC TGC GGC 273
GGG GTA ATG CTC GCC GCT CAG ATG TTC ATT ATC TCG CCG 312
CAG CAC CAC TGG TTT GTG CAG GAA TGC AAC TGC TCC ATT 351
TAC CCT GGT ACC ATC ACT GGA CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACA ACC ACC ATG ATC TTG 429
GCG TAC GCG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATC AGC GGA GCT CAC TGG GGC GTC ATG TTC GGC CTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAG GTC GTT GTC 546
ATC CTG TTG CTC ACC GCT GGC GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: 10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:



GTC CAA GTG AAA AAC ACC AGT ACC AGC TAT ATG GTG ACC 39
AAT GAC TGC TCC AAC GAC AGC ATC ACT TGG CAA CTT GAG 78
GCT GCG GTC CTC CAC GTT CCC GGG TGT GTC CCG TGC GAG 117
AAA GTG GGA AAT ACA TCT CGG TGC TGG ATA CCG GTC TCA 156
CCA AAT GTG GCC GTG CAG CGG CCT GGC GCC CTC ACG CAG 195
GGC TTG CGG ACT CAC ATC GAC ATG GTC GTG ATG TCC GCC 234
ACG CTC TGC TCC GCT CTT TAC GTG GGG GAC TTC TGC GGT 273
GGG ATG ATG CTC GCA GCC CAA ATG TTC ATT GTC TCG CCG 312
CGC CAC CAC TCG TTT GTG CAG GAA TGC AAC TGC TCC ATC 351
TAC CCC GGT ACC ATC ACC GGG CAC CGT ATG GCA TGG GAC 390
ATG ATG ATG AAC TGG TCG CCC ACG GCC ACT TTG ATC CTG 429
GCG TAC GTG ATG CGC GTT CCC GAG GTC ATC ATA GAC ATC 468
ATT AGC GGG GCG CAT TGG GGC GTC TTG TTC GGC TTA GCC 507
TAC TTC TCT ATG CAG GGA GCG TGG GCG AAA GTC GTT GTC 546
ATC CTT CTG CTA GCC GCT GGG GTG GAC GCG 576

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 



GTG GAA GTC AGG AAC ATC AGT TCC AGC TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
GAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CGC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTT ACT CAT 195
AAC CTG CGA ACA CAC GTC GAC GTG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTA TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT CTC ATA ATA TCG CCT 312
GAA CGC CAC AAC TTT ACC CAG GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTC 429
GCC TAT GCC GCT CGT GTT CCT GAG CTA GCC CTC CAG GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAA GTC ATT GCC 546
ATC CTC CTT CTT GTC GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK11 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 



GTG GAA GTC AGG AAC ACC ACT TCT AGT TAC TAC GCC ACC 39
AAT GAT TGC TCA AAC AAC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCA CTC ACT CAC 195
AAC CTG CGA GCA CAT ATA GAT ATG ATT GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC GTG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA GTA TCG CCA 312
GAA CAC CAC CAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAC ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTT AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAT GCC GCC CGT GTT CCT GAG CTA GTC CTT GAA GTC 468
GTC TTC GGT GGT CAT TGG GGT GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAG GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTT CTT GTA GCA GGA GTG GAT GCA 576

(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:

GTG GAA GTC AGG AAC ATC AGT TCT AGC TAC TAT GCC ACC 39
AAT GAT TGC TCA AAC AGC AGC ATC ACC TGG CAA CTC ACC 78
AAC GCA GTC CTC CAC CTT CCC GGA TGC GTC CCG TGT GAG 117
AAT GAT AAT GGC ACC CTG CAC TGC TGG ATA CAA GTG ACA 156
CCT AAT GTG GCT GTG AAA CAC CGC GGC GCG CTC ACT CAC 195
AAC CTG CGA GCA CAC GTC GAT ATG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGA GAC ATG TGC GGG 273
GCC GTG ATG ATC GTG TCG CAG GCT TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTT ACC CAA GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CGT ATC ACC GGC CAC CGC ATG GCG TGG GAC 390
ATG ATG CTA AAC TGG TCA CCA ACT CTT ACC ATG ATC CTT 429
GCC TAT GCC GCT CGT GTT CCT GAG CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA GCG TGG GCC AAG GTC ATT GCC 546
ATC CTC CTG CTT GTC GCA GGA GTG GAT GCA 576



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: T8 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



GTG GAA GTT AGA AAC ACC AGT TTT AGC TAC TAC GCC ACC 39
AAT GAT TGC TCG AAC AAC AGC ATC ACC TGG CAG CTC ACC 78
AAC GCA GTT CTC CAC CTT CCC GGA TGC GTC CCA TGT GAG 117
AAT GAC AAT GGC ACC TTG CGC TGC TGG ATA CAA GTA ACA 156
CCT AAT GTG GCT GTG AAA CAC CGT GGC GCA CTC ACT CAC 195
AAC CTG CGA ACG CAT GTC GAC GTG ATC GTA ATG GCA GCT 234
ACG GTC TGC TCG GCC TTG TAT GTG GGG GAC GTG TGC GGG 273
GCC GTG ATG ATA GCG TCG CAG GCT TTC ATA ATA TCG CCA 312
GAA CGC CAC AAC TTC ACC CAG GAG TGC AAC TGT TCC ATC 351
TAC CAA GGT CAT ATC ACC GGC CAC CGC ATG GCA TGG GAC 390
ATG ATG CTG AAC TGG TCA CCA ACT CTC ACC ATG ATC CTC 429
GCC TAC GCT GCT CGT GTG CCT GAA CTA GTC CTT GAA GTT 468
GTC TTC GGC GGC CAT TGG GGC GTG GTG TTT GGC TTG GCC 507
TAT TTC TCC ATG CAA GGA GCG TGG GCC AAA GTC ATC GCC 546
ATC CTC CTC CTT GTC GCA GGA GTG GAC GCA 576

(2) INFORMATION FOR SEQ ID NO: 34:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S83 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 



GTG GAG GTC AAG GAC ACC GGC GAC TCC TAC ATG CCG ACC 39
AAC GAT TGC TCC AAC TCT AGT ATC GTT TGG CAG CTT GAA 78
GGA GCA GTG CTT CAT ACT CCT GGA TGC GTC CCT TGT GAG 117
CGT ACC GCC AAC GTC TCT CGA TGT TGG GTG CCG GTT GCC 156
CCC AAT CTC GCC ATA AGT CAA CCT GGC GCT CTC ACT AAG 195
GGC CTG CGA GCA CAC ATC GAT ATC ATC GTG ATG TCT GCT 234
ACG GTC TGT TCT GCC CTT TAT GTG GGG GAC GTG TGT GGC 273
GCG CTG ATG CTG GCC GCT CAG GTC GTC GTC GTG TCG CCA 312
CAA CAC CAT ACG TTT GTC CAG GAA TGC AAC TGT TCC ATA 351
TAC CCG GGC CGC ATT ACG GGA CAC CGC ATG GCT TGG GAT 390
ATG ATG ATG AAC TGG TCG CCC ACT ACC ACC ATG CTC CTG 429
GCG TAC TTG GTG CGC ATC CCG GAA GTC ATC TTG GAT ATT 468
GTT ACA GGA GGT CAT TGG GGT GTA ATG TTT GGC CTC GCT 507
TAC TTC TCC ATG CAG GGA TCG TGG GCG AAG GTC ATC GTT 546
ATC CTC CTG CTG ACT GCT GGG GTG GAG GCG 576

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: DK12 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 



TEA GAG TGG CGG AAT GTG TCC GGC CTC TAC GTC CTT ACC 39
AAC GAC TGT TCC AAT AGC AGT ATC GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCT ACG TGC TGG ACC TCA GTG ACG 156
CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTG CTA GTG GGC GCG GCC 234
ACG ATG TGC TCT GCG CTC TAC GTG GGT GAT GTG TGT GGG 273
GCC GTC TTC CTT GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACA GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT CTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTA 429
GCG CAC GTC CTG CGT CTG CCC CAG ACC TTG TTC GAC ATA 468
ATA GCT GGG GCC CAT TGG GGC ATC ATG GCG GGC CTA GCC 507
TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGA GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: HK10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:



CTA GAG TGG CGG AAT GTG TCT GGC CTC TAT GTC CTT ACC 39
AAC GAC TGT CCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCC ACG TGC TGG ACC TCG GTG ACA 156
CCT ACA GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCC 195
TCG ATA CGC AGT CAT GTG GAC CTG TTA GTG GGC GCG GCC 234
ACG ATG TGC TCT GCG CTC TAC GTG GGC GAT ATG TGT GGG 273
GCC GTC TTC CTC GTG GGA CAA GCC TTC ACG TTC AGA CCG 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAC CTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCC GTG GGT ATG GTG GTG 429
GCG CAC GTC CTG CGG TTG CCC CAG ACC TTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCA GGC CTA GCC 507
TAT TAC TCC ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAT GCC 576

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: 



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTC ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTT ATT CTG CAC ACA CCT GGC TGT GTA CCT TGT GTT 117
CAG GAC GGT AAT ACA TCC ACG TGC TGG ACC CCA GTG ACA 156
CCT ACA GTG GCA GTC AGG TAT GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTG GTG GGC GCG GCC 234
ACT ATG TGC TCT GCG CTC TAC GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT CTT TCA GGA CAT CGC ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC GTT CTG CGT TTG CCC CAG ACC GTG TTC GAC ATA 468
ATA GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAC TCC ATG CAA GGC AAC TGG GCC AAG GTC GCT ATC 546
ATC ATG GTT ATG TTT TCA GGG GTC GAC GCC 576

(2) INFORMATION FOR SEQ ID NO: 38:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: S52 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 



CTA GAG TGG CGG AAT ACG TCT GGC CTC TAT GTC CTT ACC 39
AAC GAC TGT TCC AAT AGC AGT ATT GTG TAT GAG GCC GAT 78
GAC GTC ATT CTG CAC ACA CCC GGC TGT GTA CCT TGT GTT 117
CAG GAC GGC AAT ACA TCC ATG TGC TGG ACC CCA GTG ACA 156
CCT ACG GTG GCA GTC AGG TAC GTC GGA GCA ACC ACC GCT 195
TCG ATA CGC AGT CAT GTG GAC CTA TTA GTG GGC GCG GCC 234
ACG CTG TGC TCT GCG CTC TAT GTG GGT GAT ATG TGT GGG 273
GCC GTC TTT CTC GTG GGA CAA GCC TTC ACG TTC AGA CCT 312
CGT CGC CAT CAA ACG GTC CAG ACC TGT AAC TGC TCG CTG 351
TAC CCA GGC CAT GTT TCA GGA CAT CGA ATG GCT TGG GAT 390
ATG ATG ATG AAT TGG TCC CCC GCT GTG GGT ATG GTG GTG 429
GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468
CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507
TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATT 546
GTC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: S54 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 



CTA 


GAG 


TGG 


CGG 


AAT 


ACG 


TCT 


GGC 


CTC 


TAT 


ATC 


CTT 


ACC 


39 


AAC 


GAC 


TGT 


TCC 


AAT 


AGC 


AGT 


ATT 


GTG 


TAT 


GAG 


GCC 


GAT 


78 


GAC 


GTC 


ATT 


CTG 


CAC 


ACA 


CCC 


GGC 


TGT 


GTA 


CCT 


TGT 


GTT 


117 


CAG 


GAC 


GGC 


AAT 


ACA 


TCC 


ACG 


TGC 


TGG 


ACC 


CCA 


GTG 


ACA 


156 


CCT 


ACG 


GTG 


GCA 


GTC 


AGG 


TAC 


GTC 


GGA 


GCA 


ACC 


ACC 


GCT 


195 


TCG 


ATA 


CGC 


AGT 


CAT 


GTG 


GAC 


CTA 


TTA 


GTG 


GGC 


GCG 


GCC 


234 


ACG 


CTG 


TGC 


TCT 


GCG 


CTC 


TAT 


GTG 


GGT 


GAT 


ATG 


TGT 


GGG 


273 


GCC 


GTC 


TTT 


CTC 


GTG 


GGA 


CAA 


GCC 


TTC 


ACG 


TTC 


AGA 


CCT 


312 


CGT 


CGC 


CAT 


CAA ACG 


GTC 


CAG 


ACC 


TGT 


AAC 


TGC 


TCG 


CTG 


351 


TAC 


CCA 


GGC 


CAT 


CTT 


TCA 


GGA 


CAT 


CGA 


ATG 


GCT 


TGG 


GAT 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCC 


CCC 


GCT 


GTG 


GGT 


ATG 


GTG 


GTG 


429 



GCG CAC ATC CTG CGA TTG CCC CAG ACC TTG TTT GAC ATA 468 
CTG GCC GGG GCC CAT TGG GGC ATC TTG GCG GGC CTA GCC 507 
TAT TAT TCT ATG CAG GGC AAC TGG GCC AAG GTC GCT ATC 546 
ATC ATG ATT ATG TTT TCA GGG GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: Z4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 



GAG 


CAC 


TAC 


CGG 


AAT 


GCT 


TCG 


GGC 


ATC 


TAT 


CAC 


ATC 


ACC 


39 


AAT 


GAT 


TGT 


CCG 


AAT 


TCC 


AGT 


ATA 


GTC 


TAT 


GAA 


GCT 


GAC 


78 


CAT 


CAC 


ATC 


CTA 


CAC 


TTG 


CCG 


GGG 


TGC 


GTA 


CCC 


TGT 


GTG 


117 


ATG 


ACT 


GGG 


AAC 


ACA 


TCG 


CGT 


TGC 


TGG 


ACG 


CCG 


GTG 


ACG 


156 


CCT 


ACA 


GTG 


GCT 


GTC 


GCA 


CAC 


CCG 


GGC 


GCT 


CCG 


CTT 


GAG 


195 


TCG 


TTC 


CGG 


CGA 


CAT 


GTG 


GAC 


TTA 


ATG 


GTA 


GGC 


GCG 


GCC 


234 


ACT 


TTG 


TGT 


TCT 


GCC 


CTC 


TAT 


GTT 


GGG 


GAC 


CTC 


TGC 


GGA 


273 


GGT 


GCC 


TTC 


CTG 


ATG 


GGG 


CAG 


ATG 


ATC 


ACT 


TTT 


CGG 


CCG 


312 


CGT 


CGC 


CAC 


TGG 


ACC 


ACG 


CAG 


GAG 


TGC 


AAT 


TGT 


TCC 


ATC 


351 


TAC 


ACT 


GGC 


CAT 


ATC 


ACC 


GGC 


CAC 


AGG 


ATG 


GCG 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


AGC 


CCT 


ACC 


ACC 


ACT 


CTG 


CTC 


CTC 


429 


GCC 


CAG 


ATC 


ATG 


AGG 


GTC 


CCC 


ACA 


GCC 


TTT 


CTC 


GAC 


ATG 


468 


GTT 


GCC 


GGA 


GGC 


CAC 


TGG 


GGC 


GTC 


CTC 


GCG 


GGC 


TTG 


GCG 


507 


TAC 


TTC 


AGC 


ATG 


CAA 


GGC 


AAT 


TGG 


GCC 


AAG 


GTA 


GTC 


CTG 


546 


GTC 


CTT 


TTC 


CTC 


TTT 


GCT 


GGG 


GTA 


GAC 


GCC 



(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 



GTG CAC TAC CGG AAT GCT TCG GGC GTC TAT CAT GTC ACC 39 
AAT GAT TGC CCT AAC ACC AGC ATA GTG TAC GAG ACG GAG 78 
CAC CAC ATC ATG CAC TTG CCA GGG TGT GTC CCC TGT GTG 117 







[illegible] AAT ACT TCT CGC TGC TGG GTG [illegible] TTG ACC 156 
[illegible] ACT [illegible] GCC [illegible] CCC TAT CCC AAC GCA CCG TTA GAG 195 
[illegible] CAT GTA GAC [illegible] ATG GTG GGT GCG GCT 234 
[illegible] TAC [illegible] CTG [illegible] GGA 273 
[illegible] GAC [illegible] CGA CCG 312 
[illegible] TCC ATC 351 


TAT 


CCT 


GGT 


CAC 


GTC 


TCG 


GGC 


CAC 


AGG 


ATG 


GCC 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


AGC 


CCT 


ACC 


AGC 


GCG 


CTG 


ATT 


ATG 


429 


GCT 


CAG 


ATC 


TTA 


CGG 


ATC 


CCC 


TCT 


ATC 


CTA 


GGT 


GAC 


TTG 


468 


CTC 


ACC 


GGG 


GGT 


CAC 


TGG 


GGA 


GTT 


CTT 


GCT 


GGT 


CTA 


GCT 


507 


TTC 


TTC 


AGC 


ATG 


CAG 


AGT 


AAC 


TGG 


GCG 


AAG 


GTC 


ATC 


CTG 


546 


GTC 


CTA 


TTC 


CTC 


TTT 


GCC 


GGG 


GTC 


GAG 


GGA 



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: Z6 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 



GTT 


AAC 


TAT 


CGC 


AAT 


GCC 


TCG 


GGC 


GTC 


TAT 


CAC 


GTC 


ACC 


39 


AAC 


GAC 


TGC 


CCG 


AAC 


TCG 


AGC 


ATA 


GTG 


TAT 


GAG 


GCC 


GAA 


78 


CAC 


CAG 


ATC 


TTA 


CAC 


CTC 


CCA 


GGG 


TGC 


TTG 


CCC 


TGT 


GTG 


117 


AGG 


GTT 


GGG 


AAT 


CAG 


TCA 


CGC 


TGC 


TGG 


GTG 


GCC 


CTT 


ACT 


156 


CCC 


ACC 


GTG 


GCG 


GTG 


TCT 


TAT 


ATC 


GGT 


GCT 


CCG 


CTT 


GAC 


195 


TCC 


CTC 


CGG 


AGA 


CAT 


GTG 


GAC 


CTG 


ATG 


GTG 


GGC 


GCC 


GCT 


234 


ACT 


GTA 


TGC 


TCT 


GCC 


CTC 


TAC 


GTT 


GGA 


GAT 


CTG 


TGC 


GGT 


273 


GGT 


GCA 


TTC 


TTG 


GTT 


GGC 


CAG 


ATG 


TTC 


TCC 


TTC 


CAG 


CCG 


312 


CGA 


CGC 


CAC 


TGG 


ACT 


ACG 


CAG 


GAC 


TGC 


AAT 


TGT 


TCT 


ATC 


351 


TAC 


GCA 


GGG 


CAT 


ATC 


ACG 


GGC 


CAC 


AGG 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


AGT 


CCC 


ACA 


ACC 


ACC 


CTG 


CTT 


CTC 


429 


GCC 


CAG 


GTC 


ATG 


AGG 


ATC 


CCT 


AGC 


ACT 


CTG 


GTA 


GAT 


CTA 


468 


CTC 


GCT 


GGA 


GGG 


CAC 


TGG 


GGC 


GTC 


CTT 


GTT 


GGG 


TTG 


GCG 


507 


TAC 


TTC 


AGT 


ATG 


CAA 


GCT 


AAT 


TGG 


GCC 


AAA 


GTC 


ATC 


CTG 


546 


GTC 


CTT 


TTC 


CTC 


TTC 


GCT 


GGA 


GTT 


GAT 


GCC 



(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 




GTC 




TAT 


CAC 


AAT 


GCC 


TCG 


GGC 


GTC 


TAT 


CAC 


ATC 


ACC 


39 


AAC 


GAC 


TGC 


CCG 


AAC 


TCG 


AGC 


ATA 


ATG 


TAT 


GAG 


GCC 


GAA 


78 


CAC 


CAC 


ATC 


CTA 


CAC 


CTC 


CCA 


GGG 


TGC 


GTA 


CCC 


TGT 


GTG 


117 


AGG 


GAG 


GGG 


AAC 


CAG 


TCA 


CGC 


TGC 


TGG 


GTG 


GCC 


CTT ACT 


156 


CCC 


ACC 


GTG 


GCG 


GCG 


CCT 


TAT 


ATC 


GGT 


GCA 


CCG 


CTT 


GAA 


195 


TCC 


ATC 


CGG 


AGA 


CAT 


GTG 


GAC 


CTG 


ATG 


GTA 


GGC 


GCT 


GCT 


234 


ACA 


GTG 


TGC 


TCC 


GCT 


CTC 


TAC 


ATT 


GGG 


GAC 


CTG 


TGC 


GGT 


273 


GGC 


GTA 


TTT 


TTG 


GTT 


GGT 


CAG 


ATG 


TTT 


TCT 


TTC 


CAG 


CCG 


312 


CGA 


CGC 


CAC 


TGG 


ACT 


ACG 


CAG 


GAC 


TGC 


AAT 


TGT 


TCC 


ATC 


351 


TAT 


GCG 


GGG 


CAC 


GTT 


ACA 


GGC 


CAC 


AGA 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


AGT 


CCC 


ACA 


ACC 


ACC 


TTG 


GTC 


CTC 


429 


GCC 


GAG 


GTT 


ATG 


AGG 


ATC 


CCT 


AGC 


ACT 


CTG 


GTG 


GAC 


CTA 


468 


CTC 


ACT 


GGA 


GGG 


CAC 


TGG 


GGT 


ATC 


CTT 


ATC 


GGG 


GTG 


GCA 


507 


TAC 


TTC 


TGC 


ATG 


CAA 


GCT 


AAT 


TGG 


GCC 


AAG 


GTC 


ATT 


CTG 


546 


GTC 


CTT 


TTC 


CTC 


TAC 


GCT 


GGA 


GTT 


GAT 


GCC 



(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK13 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 



TAC 


AAC 


TAT 


CGC 


AAC 


AGC 


TCG 


GGT 


GTC 


TAC 


CAT 


GTC 


ACC 


39 


AAC 


GAT 


TGC 


CCG 


AAC 


TCG 


AGC 


ATA 


GTC 


TAT 


GAA 


ACC 


GAT 


78 


TAC 


CAC 


ATC 


TTA 


CAC 


CTC 


CCG 


GGA 


TGC 


GTT 


CCT 


TGC 


GTG 


117 


AGG 


GAA 


GGG 


AAC 


AAG 


TCT 


ACA 


TGC 


TGG 


GTG 


TCT 


CTC 


ACC 


156 


CCC 


ACC 


GTG 


GCT 


GCG 


CAA 


CAT 


CTG 


AAT 


GCT 


CCG 


CTT 


GAG 


195 


TCT 


TTG 


AGA 


CGT 


CAC 


GTG 


GAT 


CTG 


ATG 


GTG 


GGC 


GGC 


GCC 


234 


ACT 


CTC 


TGC 


TCC 


GCC 


CTC 


TAC 


ATC 


GGA 


GAC 


GTG 


TGT 


GGG 


273 


GGT 


GTG 


TTC 


TTG 


GTC 


GGT 


CAA 


CTG 


TTC 


ACC 


TTC 


CAA 


CCT 


312 


CGC 


CGC 


CAC 


TGG 


ACC 


ACC 


CAA 


GAC 


TGC 


AAT 


TGT 


TCC 


ATC 


351 


TAC 


ACA 


GGA 


CAT 


ATC 


ACA 


GGA 


CAC 


AGA 


ATG 


GCT 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


AGC 


CCC 


ACT 


GCG 


ACG 


CTG 


GTC 


CTC 


429 


GCC 


CAA 


CTT 


ATG 


AGG 


ATC 


CCA 


GGC 


GCC 


ATG 


GTC 


GAC 


CTG 


468 


CTT 


GCA 


GGC 


GGC 


CAC 


TGG 


GGC 


ATT 


CTG 


GTT 


GGC 


ATA 


GCG 


507 


TAC 


TTC 


AGC 


ATG 


CAA 


GCT 


AAT 


TGG 


GCC 


AAG 


GTT 


ATC 


CTG 


546 


GTC 


CTG 


TTT 


CTC 


TTT 


GCT 


GGA 


GTC 


GAC 


GCT 



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



GTT 


CCC 


TAC 


CGG 


AAT 


GCC 


TCT 


GGG 


GTT 


TAC 


CAT 


GTC 


ACC 


39 


AAT 


GAC 


TGC 


CCA 


AAC 


TCC 


TCC 


ATA 


GTC 


TAC 


GAG 


GCT 


GAT 


78 


AGC 


CTG 


ATC 


TTG 


CAC 


GCA 


CCT 


GGC 


TGC 


GTG 


CCC 


TGT 


GTC 


117 


AGG 


CAA 


GAT 


AAT 


GTC 


AGT 


AGG 


TGC 


TGG 


GTC 


CAA 


ATC 


ACC 


156 


CCC 


ACA 


CTG 


TCA 


GCC 


CCG 


ACC 


TTC 


GGA GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 


CGG 


AGG 


GCC 


GTT 


GAC 


TAC 


TTA 


GCG 


GGA 


GGA 


GCT 


234 


GCT 


CTC 


TGC 


TCC 


GCA 


CTA 


TAC 


GTC 


GGC 


GAC 


GCG 


TGC 


GGG 


273 


GCA 


GTG 


TTT 


CTG 


GTA 


GGC 


CAA 


ATG 


TTC 


ACC 


TAT 


AGG 


CCT 


312 


CGC 


CAG 


CAT 


ACC 


ACA 


GTG 


CAG 


GAC 


TGC 


AAC 


TGT 


TCC 


ATT 


351 


TAC 


AGT 


GGC 


CAT 


ATC 


ACC 


GGC 


CAC 


CGG 


ATG 


GCT 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCA 


CCT 


ACG 


ACA 


GCC 


TTG 


CTG 


ATG 


429 


GCC 


CAG 


ATG 


CTA 


CGG 


ATC 


CCC 


CAG 


GTG 


GTC 


ATA 


GAC 


ATC 


468 


ATA 


GCC 


GGG 


GGC 


CAC 


TGG 


GGG 


GTC 


TTG 


TTT 


GCC 


GCC 


GCA 


507 


TAC 


TTT 


GCG 


TCG 


GCC 


GCC 


AAC 


TGG 


GCT 


AAG 


GTA 


GTG 


CTG 


546 


GTT 


CTG 


TTC 


CTG 


TTT 


GCG 


GGG 


GTC 


GAT 


GGC 



(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SA4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 



GTT 


CCC 


TAC 


CGA 


AAC 


GCC 


TCT 


GGG 


GTT 


TAT 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCA 


AAC 


TCT 


TCC 


ATA 


GTT 


TAC 


GAG 


GCT 


GAT 


78 


AAC 


CTG 


ATC 


TTG 


CAT 


GCA 


CCT 


GGT 


TGC 


GTG 


CCT 


TGT 


GTC 


117 


AGG 


CAA 


GAT 


AAT 


GTC 


AGT 


AAG 


TGC 


TGG 


GTC 


CAA 


ATC 


ACC 


156 


CCC 


ACG 


TTG 


TCA 


GCC 


CCG 


AAT 


CTC 


GGA 


GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 


CGG 


AGG 


GCC 


GTT 


GAC 


TAC 


TTA 


GCG 


GGA 


GGG 


GCT 


234 


GCC 


CTC 


TGC 


TCC 


GCA 


CTA 


TAC 


GTC 


GGG 


GAC 


GCG 


TGC 


GGG 


273 


GCA 


GTG 


TTT 


TTG 


GTA 


GGC 


CAA 


ATG 


TTC 


ACC 


TAT 


AGG 


CCT 


312 


CGC 


CAG 


CAC 


ACT 


ACG 


GTG 


CAA 


GAC 


TGC 


AAT 


TGC 


TCT 


ATT 


351 


TAC 


AGT 


GGC 


CAT 


ATC 


ACC 


GGC 


CAC 


CGG 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCA 


CCT 


ACG 


ACG 


GCC 


TTG 


CTG 


ATG 


429 

GCC CAG TTG CTA CGG ATT CCC CAG GTG GTC ATC GAC ATC 468 

ATT GCC GGG GGC CAC TGG GGG GTC TTG TTT GCC GCC GCA 507 

TAT TTC GCG TCA GCG GCT AAC TGG GCT AAG GTT ATA CTG 546 

GTC TTG TTT CTG TTT GCG GGG GTC GAT GCC 576 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 



GTC 


CCC 


TAC 


CGA 


AAT 


GCC 


TCT 


GGG 


GTT 


TAT 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCA 


AAC 


TCT 


TCC 


ATA 


GTC 


TAC 


GAG 


GCT 


GAT 


78 


AAC 


CTG 


ATT 


CTG 


CAC 


GCA 


CCT 


GGT 


TGC 


GTG 


CCC 


TGT 


GTC 


117 


AAG 


GAA 


GGT 


AAT 


GTC 


AGT 


AGG 


TGC 


TGG 


GTC 


CAA 


ATC 


ACC 


156 


CCC 


ACA 


TTG 


TCA 


GCC 


CCG 


AAC 


CTC 


GGA 


GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 


CGG 


AGG 


GTC 


GTT 


GAC 


TAC 


TTA 


GCG 


GGA 


GGG 


GCT 


234 


GCC 


CTC 


TGC 


TCC 


GCA 


CTA 


TAC 


GTC 


GGG 


GAC 


GCG 


TGC 


GGG 


273 


GCA 


GTG 


TTC 


TTG 


GTA 


GGC 


CAA 


ATG 


TTC 


ACC 


TAT 


AGG 


CCT 


312 


CGC 


CAG 


CAT 


ACT 


ACG 


GTG 


CAG 


GAC 


TGC 


AAC 


TGT 


TCC 


ATT 


351 


TAC 


AGC 


GGC 


CAT 


ATC 


ACC 


GGC 


CAC 


CGA 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCA 


CCT 


ACG 


ACA 


GCC 


TTG 


GTG 


ATG 


429 


GCC 


CAG 


GTG 


CTA 


CGG 


ATT 


CCC 


CAA 


GTG 


GTC 


ATT 


GAC 


ATC 


468 


ATT 


GCC 


GGG 


GGC 


CAC 


TGG 


GGG 


GTC 


TTG 


TTC 


GCC 


GTC 


GCA 


507 


TAC 


TTC 


GCG 


TCA 


GCG 


GCT 


AAC 


TGG 


GCT 


AAG 


GTT 


GTG 


CTG 


546 


GTC 


CTG 


TTT 


CTG 


TTT 


GCG 


GGG 


GTC 


GAT 


GGC 



(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: SA6 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 



GTT CCT TAC CGG AAT GCC TCT GGG GTG TAT CAT GTT ACC 39 

AAT GAT TGC CCA AAC TCT TCC ATA GTC TAT GAG GCT GAT 78 

GAC CTG ATC CTA CAC GCA CCT GGC TGC GTG CCC TGT GTC 117 

CGG AAG GAT AAT GTC AGT AGA TGC TGG GTT CAT ATC ACC 156 






CCC 


ACA 


CTA 


TCA 


GCC 


CCG 


AGC 


CTC 


GGA 


GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 


CGG 


AGG 


GCC 


GTT 


GAT 


TAC 


TTG 


GCG 


GGA 


GGG 


GCC 


234 


GCC 


CTG 


TGC 


TCC 


GCG 


TTA 


TAC 


GTC 


GGA 


GAC 


GTG 


TGC 


GGG 


273 


GCA 


TTG 


TTT 


TTG 


GTA 


GGC 


CAA 


ATG 


TTC 


ACC 


TAT 


AGG 


CCT 


312 


CGC 


CAG 


CAT 


GCT 


ACG 


GTA 


CAG 


GAC 


TGC 


AAC 


TGC 


TCC 


ATT 


351 


TAC 


AGT 


GGC 


CAT 


ATC 


ACT 


GGC 


CAC 


CGG 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCA 


CCC 


GCG 


ACA 


GCC 


TTG 


GTG 


ATG 


429 


GCC 


CAA 


ATG 


CTA 


CGG 


ATT 


CCC 


CAG 


GTG 


GTC 


ATT 


GAC 


ATC 


468 


ATT 


GCC 


GGG 


GGC 


CAC 


TGG 


GGG 


GTC 


TTG 


TTC 


GCC 


GCT 


GCA 


507 


TAC 


TTC 


GCG 


TCG 


GCG 


GCT 


AAC 


TGG 


GCT 


AAG 


GTT 


GTG 


CTG 


546 


GTC 


TTG 


TTT 


CTG 


TTT 


GCG 


GGG 


GTT 


GAT 


GCC 



(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA7 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 



GTC 


CCC 


TAC 


CGA 


AAT 


GCC 


TCC 


GGG 


GTT 


TAT 


CAT 


GTC 


ACC 


39 


AAT 


GAT 


TGC 


CCG 


AAC 


TCT 


TCC 


ATA 


GTC 


TAT 


GAG 


GCT 


GAC 


78 


AAC 


CTG 


ATC 


CTG 


CAC 


GCA 


CCT 


GGT 


TGC 


GTG 


CCC 


TGT 


GTC 


117 


AGA 


CAA 


AAT 


AAT 


GTC 


AGT 


AGG 


TGC 


TGG 


GTC 


CAA 


ATC 


ACC 


156 


CCC 


ACA 


TTG 


TCA 


GCC 


CCG 


AAC 


CTC 


GGA 


GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 


CGG 


AGG 


GCC 


GTT 


GAC 


TAC 


CTA 


GCG 


GGA 


GGG 


GCT 


234 


GCC 


CTC 


TGC 


TCC 


GCG 


CTA 


TAC 


GTC 


GGG 


GAC 


GCG 


TGC 


GGG 


273 


GCA 


GTG 


TTT 


TTG 


GTA 


GGC 


CAG 


ATG 


TTC 


AGC 


TAT 


AGG 


CCT 


312 


CGC 


CAG 


CAC 


ACT 


ACG 


GTG 


CAG 


GAC 


TGC 


AAC 


TGT 


TCC 


ATT 


351 


TAC 


AGT 


GGC 


CAT 


ATC 


ACC 


GGC 


CAC 


CGA 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCA 


CCT 


ACG 


ACA 


GCC 


TTG 


GTG 


ATG 


429 


GCC 


CAG 


TTG 


CTA 


CGG 


ATT 


CCC 


CAG 


GTG 


GTC 


ATC 


GAC 


ATC 


468 


ATT 


GCC 


GGG 


GGC 


CAC 


TGG 


GGG 


GTC 


TTG 


TTC 


GCC 


GCC 


GCA 


507 


TAT 


TTC 


GCG 


TCA 


GCG 


GCT 


AAC 


TGG 


GCT 


AAG 


GTT 


GTG 


CTG 


546 


GTC 


TTG 


TTT 


CTG 


TTT 


GCG 


GGG 


GTC 


GAT 


GCC 








576 



(2) INFORMATION FOR SEQ ID NO: 50: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 



GTT 


CCC 


TAC 


CGA 


AAT 


GCC 


TCT 


GGG 


GTT 


TAT 


CAT 


GTC 


ACC 




AAT 


GAT 


TGC 


CCA 


AAC 


TCT 


TCC 


ATC 


GTC 


TAC 


GAG 


GCT 


GAT 




GAC 


CTG 


ATC 


TTA 


CAC 


GCA 


CCT 


GGT 


TGC 


GTG 


CCC 


TGT 


GTT 


117 


AGG 


CAG 


GGT 


AAT 


GTC 


AGT 


AGG 


TGC 


TGG 


GTC 


CAG 


ATC 


ACC 


156 


CCC 


ACA 


CTG 


TCA 


GCC 


CCG 


AGC 


CTC 


GGA 


GCG 


GTC 


ACG 


GCT 


195 


CCT 


CTT 


CGG 


AGG 


GCC 


GTT 


GAC 


TAC 


TTA 


GCG 


GGG 


GGG 


GCT 


234 


GCC 


CTT 


TGC 


TCC 


GCG 


TTA 


TAC 


GTC 


GGA 


GAC 


GCG 


TGC 


GGG 


273 


GCA 


GTG 


TTT 


TTG 


GTA 


GGT 


CAA 


ATG 


TTC 


ACC 


TAT 


AGC 


CCT 


312 


CGC 


CGG 


CAT 


AAT 


GTT 


GTG 


CAG 


GAC 


TGC 


AAC 


TGT 


TCC 


ATT 


351 


TAC 


AGT 


GGC 


CAC 


ATC 


ACC 


GGC 


CAC 


CGG 


ATG 


GCA 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAT 


TGG 


TCA 


CCT 


ACA 


ACA 


GCT 


TTG 


GTG 


ATG 


429 


GCC 


CAG 


TTG 


TTA 


CGG 


ATT 


CCC 


CAG 


GTG 


GTC 


ATT 


GAC 


ATC 


468 


ATT 


GCC 


GGG 


GCC 


CAC 


TGG 


GGG 


GTC 


TTG 


TTC 


GCC 


GCC 


GCA 


507 


TAC 


TAC 


GCG 


TCG 


GCG 


GCT 


AAC 


TGG 


GCC 


AAG 


GTT 


GTG 


CTG 


546 


GTC 


CTG 


TTT 


CTG 


TTT 


GCG 


GGG 


GTC 


GAT 


GCC 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 576 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: HK2 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 



CTT 


ACC 


TAC 


GGC 


AAC 


TCC 


AGT 


GGG 


CTA 


TAC 


CAT 


CTC 


ACA 


39 


AAT 


GAT 


TGC 


CCC 


AAC 


TCC 


AGC 


ATC 


GTG 


CTG 


GAG 


GCG 


GAT 


78 


GCT 


ATG 


ATC 


TTG 


CAT 


TTG 


CCT 


GGA 


TGC 


TTG 


CCT 


TGT 


GTG 


117 


AGG 


GTC 


GAT 


GAT 


CGG 


TCC 


ACC 


TGT 


TGG 


CAT 


GCT 


GTG 


ACC 


156 


CCC 


ACC 


CTG 


GCC 


ATA 


CCA 


AAT 


GCT 


TCC 


ACG 


CCC 


GCA 


ACG 


195 


GGA 


TTC 


CGC 


AGG 


CAT 


GTG 


GAT 


CTT 


CTT 


GCG 


GGC 


GCC 


GCA 


234 


GTG 


GTT 


TGC 


TCA 


TCC 


CTG 


TAC 


ATC 


GGG 


GAC 


CTG 


TGT 


GGC 


273 


TCT 


CTC 


TTT 


TTG 


GCG 


GGA 


CAA 


CTA 


TTC 


ACC 


TTT 


CAG 


CCC 


312 


CGC 


CGT 


CAT 


TGG 


ACT 


GTG 


CAA 


GAC 


TGC 


AAC 


TGC 


TCC 


ATC 


351 


TAT 


ACA 


GGC 


CAC 


GTC 


ACC 


GGC 


CAC 


AGG 


ATG 


GCT 


TGG 


GAC 


390 


ATG 


ATG 


ATG 


AAC 


TGG 


TCA 


CCC 


ACA 


ACC 


ACT 


CTG 


GTC 


CTA 


429 


TCT 


AGC 


ATC 


TTG 


AGG 


GTA 


CCT 


GAG 


ATT 


TGT 


GCG 


AGT 


GTG 


468 


ATA 


TTT 


GGT 


GGC 


CAT 


TGG 


GGG 


ATA 


CTA 


CTA 


GCC 


GTT 


GCC 


507 


TAC 


TTT 


GGC 


ATG 


GCT 


GGC 


AAC 


TGG 


CTA 


AAA 


GTT 


CTG 


GCT 


546 


GTT 


CTG 


TTC 


CTA 


TTT 


GCA 


GGG 


GTT 


GAA 


GCA 



(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK7 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 


Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp 
                  5                  10                  15 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu 
                 20                  25                  30 

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Val Ser 
                 35                  40                  45 

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly 
                 50                  55                  60 

Lys Leu Pro Thr Ala Gln Leu Arg Arg His Ile Asp Leu Leu Val 
                 65                  70                  75 

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys 
                 80                  85                  90 

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg 
                 95                 100                 105 

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly 
                110                 115                 120 

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 
                125                 130                 135 

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro 
                140                 145                 150 

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu 
                155                 160                 165 

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val 
                170                 175                 180 

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala 
                185                 190 



(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DK9 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp 
                  5                  10                  15 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu 
                 20                  25                  30 

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser 
                 35                  40                  45 

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly 
                 50                  55                  60 

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val 
                 65                  70                  75 

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys 
                 80                  85                  90 

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg 
                 95                 100                 105 

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly 
                110                 115                 120 

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 
                125                 130                 135 

Ser Pro Thr Ala Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro 
                140                 145                 150 

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu 
                155                 160                 165 

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val 
                170                 175                 180 

Val Val Val Leu Leu Leu Phe Thr Gly Val Asp Ala 
                185                 190 





(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR1 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp 
                  5                  10                  15 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu 
                 20                  25                  30 

His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser 
                 35                  40                  45 

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly 
                 50                  55                  60 

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val 
                 65                  70                  75 

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys 
                 80                  85                  90 

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg 
                 95                 100                 105 

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly 
                110                 115                 120 

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 
                125                 130                 135 

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro 
                140                 145                 150 

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu 
                155                 160                 165 

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val 
                170                 175                 180 

Val Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala 
                185                 190 





(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: DR4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 



His Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp 
                  5                  10                  15 

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu 
                 20                  25                  30 

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser 
                 35                  40                  45 

Arg Cys Trp Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly 
                 50                  55                  60 

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val 
                 65                  70                  75 

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys 
                 80                  85                  90 

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg 
                 95                 100                 105 

His His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly 
                110                 115                 120 

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp 
                125                 130                 135 

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro 
                140                 145                 150 

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu 
                155                 160                 165 

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val 
                170                 175                 180 

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala 
                185                 190 



(2) INFORMATION FOR SEQ ID NO: 56: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: S14 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 



Tvr Gin 


V CL.M. 






Cor 
OCX 


14 IX. 


urxy 


JucU 


iyr 


xlXS 


17a 1 


inr 


ASI1 ASp 








c 










1U 








1 c 

lb 


v^y o c x. \j 




Gar 

OC?X 






Val 


iyr 


UlU 


rpV» 

l nz. 


Aj.a 


Asp 


Ala 


11 e Leu 








20 










25 






30 


His Ala Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Thr Ser
                 35                  40                  45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
                 50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg Tyr Ile Asp Leu Leu Val
                 65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg Leu Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190









(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: S18

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:

Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Thr Ile Leu
                 20                  25                  30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
                 35                  40                  45

Arg Cys Trp Val Pro Val Ala Pro Thr Val Ala Thr Arg Asp Gly
                 50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                 65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Ile Ser Pro Arg
                 95                 100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Ile Ala Gln Leu Leu Arg Val Pro
                140                 145                 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Leu Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190






(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: SW1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Ala Ile Leu
                 20                  25                  30

His Ser Pro Gly Cys Val Pro Cys Val Arg Glu Asp Gly Ala Pro
                 35                  40                  45

Lys Cys Trp Val Ala Val Ala Pro Thr Val Ala Thr Arg Asp Gly
                 50                  55                  60

Lys Leu Pro Ala Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                 65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ser Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: US11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:



Tyr Gln Val Arg Asn Ser Thr Gly Leu Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser
                 35                  40                  45

Arg Cys Trp Val Ala Met Thr Pro Thr Val Ala Thr Arg Asp Gly
                 50                  55                  60

Lys Leu Pro Thr Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val
                 65                  70                  75

Gly Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Gly Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Trp Thr Thr Gln Gly Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Met Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(C) INDIVIDUAL ISOLATE: D1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asp Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Gly
                 50                  55                  60

Asn Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Leu Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190









(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: D3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 



Tyr Glu 


Val 


Arg 


Asn 


Val 


Ser 


Gly 


Val Tyr 


Gin Val Thr Asn Aan 








5 








10 


1j 


Cys Ser 


Asn 


Ser 


Ser 


lie 


Val 


Tvr 

J; 


Glu Thr 


71 "1 a 7A ctj Mi&f* Tl o TvTot- 








20 








25 


JU 


His Thr 


Pro 


Gly 


Cvs 


Val 


Pro 


Cys 


Val Arg 


Gill Asn A^n Qpr Car 

"tJJJ AOii Od 








35 








An 




Arg Cys 


Tro 


Val 


Ala 


Leu 


Thr 


Pro 


Thi" Ti^ii 

liil JJCU 


A 1 3 A "1 3 A TTT Aor» Of>-v 

AlCl nlcl i-n. y noil Ocl 
















C cr 
DD 


60 


Ser Val 

OCL VCll 


IT i. VJ 


Th*r 




1I1.L 


lie 


Arg 


HTy rlis 


Val Asp Leu Leu Val 
















/U 


75 


Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(C) INDIVIDUAL ISOLATE: DK1 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Met
                 20                  25                  30



WO 95/01442 



PCT/US94/07320



His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn His Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Ala Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Leu Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly
                185                 190





(2) INFORMATION FOR SEQ ID NO: 63: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: HK3

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:



Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Val Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Leu Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: HK4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:



His Glu Val His Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Ile Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Leu Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190





(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: HK5 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 



Tyr 61 u 


Val 


Arcr 


As n 


Val 


Ser 


Glv 


Val 


T*VT" 


H-Lci Val Thr 

X1J»£9 VdJL ±HJm 


7\ en A en 








c 

D 










i n 

1U 




1j 




Asn 


Uw KM 


■sex 


Tie 


Val 






Thr 


Thr Asp Met 


XX CS JLICL. 








mi \J 












J u 


His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Ala Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190





(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: HK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:



Tyr Glu Val Arg Asn Val Ser Gly Ile Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Met Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Val
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: IND5 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Ser Thr Thr Thr Ile Arg His His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: IND8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68:





Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Phe Ser
                 35                  40                  45

Ser Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Ile Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(C) INDIVIDUAL ISOLATE: P10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69:



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                 50                  55                  60

Ser Val Pro Thr Thr Ala Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Leu Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Trp Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Leu Asp Val Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: S9 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 



Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Val Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Gln Glu Gly Asn Ser Ser
                 35                  40                  45

Gln Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala
                 50                  55                  60

Thr Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Val Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Ile Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asn Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190







(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(C) INDIVIDUAL ISOLATE: S45 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Tyr Glu Val Arg Asn Val Ser Gly Ala Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Val Asp Val Ile Leu
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 72:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: SA10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72:



Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ser
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg Tyr Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

Arg Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Ile Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens
(C) INDIVIDUAL ISOLATE: SW2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73:




Tyr Glu Val Arg Asn Val Ser Gly Val Tyr His Val Thr Asn Asp
                  5                  10                  15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met
                 20                  25                  30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ala Asn Ser Ser
                 35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Thr
                 50                  55                  60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val
                 65                  70                  75

Gly Ala Ala Ala Phe Cys Ser Val Met Tyr Val Gly Asp Leu Cys
                 80                  85                  90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg
                 95                 100                 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val
                170                 175                 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly
                185                 190
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(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 



Tyr Glu Val Arg Asn Val Ser Gly Val Tyr Tyr Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Thr Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Ser Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala

50 55 60

Ser Val Pro Thr Lys Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Ala Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Val Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Arg His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Val Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly

185 190



(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: T10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75:





Val 

V GL-L 


7A Y"rr 

Aiy 


noil 


Val 


OCX. 


Glv 




iyi 




Val TH t* Aqti Acin 
vax xux nou /u)u 






C 
D 










1U 




13 


Cys ser 


7\ £3 T"l 

ABU 


OCl 


Oof 

OCX 


11c 


val 




urIU 


711 a 
Ala. 


Alex 


ASp licU lie tflcU 






on 
















111 23 1X11 


irlvj 


CI V 
\JT-L jr 




VCLX 






Val 


7A T*<T 

Arg 


UlU 


ftll v Han Qor* Cor 
urijr rian ocx. oci 








jD 










An 




A C 


71 rvr rSrc 
ATy UyS 


irp 


VCLJL 


xild 


JJCll 


J. Ill 


xrJL7(j 


1111 


licU 


AlcL 


Aid Aiy Aoll 1111 


















DO 




cn 


be! vai 


Pro 


inr 


inr 


inr 


lie 


Arg 


Arg 


n't es 


Val 


ASp Lieu LicU Val 








rr 

bb 










/U 




/a 


Caiy AJ.3. 


Aia 


7V*1 9 

Aia 


Jriie 


Cys 


Ser 


Aia 


7dfA+- 

Mel 


Tyr 


vai 


v»iy Asp lieu (jys 








Q ft 










85 




90 


Gly Ser 


vax 


fllS 


Leu 


vax 


Ser 


bin 


Leu 


13 V* a 

-file 


Tiir 


jrne oer Fro Arg 








ft. c 










1 ft ft 

100 




105 


A17y His 




IIIX 


LicU 


I? Ill 


71 en 
A5p 




7A 0 
ASH 


\jys 


Cot* 

OCX 


lie lyiv xriVsJ Vjiy 








110 










115 




120 


His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Met Asp Met Val Thr Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Ala Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Met Leu Leu Phe Ala Gly Val Asp Gly

185 190



(2) INFORMATION FOR SEQ ID NO: 76:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: US6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76:

Tyr Glu Val Arg Asn Val Ser Gly Met Tyr His Val Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Met Ile Met

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Arg Glu Asn Asn Ser Ser

35 40 45

Arg Cys Trp Val Ala Leu Thr Pro Thr Leu Ala Ala Arg Asn Ala

50 55 60

Ser Val Pro Thr Thr Thr Ile Arg Arg His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Phe Cys Ser Ala Met Tyr Val Gly Asp Leu Cys

80 85 90

Gly Ser Val Phe Leu Ile Ser Gln Leu Phe Thr Phe Ser Pro Arg

95 100 105

Gln His Glu Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Ala Leu Val Val Ser Gln Leu Leu Arg Ile Pro

140 145 150

Gln Ala Val Met Asp Met Val Ala Gly Ala His Trp Gly Val Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Val Gly Asn Trp Ala Lys Val

170 175 180

Leu Ile Val Leu Leu Leu Phe Ala Gly Val Asp Gly

185 190





(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77:

Ala Gln Val Arg Asn Thr Ser Arg Gly Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Glu Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Ile Pro Cys Glu Arg Leu Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Thr Pro Asn Val Ala Val Arg Gln Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg

95 100 105

Arg His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Gly Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78:

Ala Gln Val Lys Asn Thr Thr Asn Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Lys Thr Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Arg Gln Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Gln

95 100 105

His His Trp Phe Val Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Leu Asp Ile Val Ser Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T9

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79:

Ala Glu Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Gln Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Arg Val Gly Asn Ala Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys

80 85 90

Gly Gly Val Met Leu Ala Ala Gln Met Phe Ile Ile Ser Pro Gln

95 100 105

His His Trp Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Thr Thr Met Ile Leu Ala Tyr Ala Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Met

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Thr Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: US10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80:

Val Gln Val Lys Asn Thr Ser Thr Ser Tyr Met Val Thr Asn Asp

5 10 15

Cys Ser Asn Asp Ser Ile Thr Trp Gln Leu Glu Ala Ala Val Leu

20 25 30

His Val Pro Gly Cys Val Pro Cys Glu Lys Val Gly Asn Thr Ser

35 40 45

Arg Cys Trp Ile Pro Val Ser Pro Asn Val Ala Val Gln Arg Pro

50 55 60

Gly Ala Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val

65 70 75

Met Ser Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Phe Cys

80 85 90

Gly Gly Met Met Leu Ala Ala Gln Met Phe Ile Val Ser Pro Arg

95 100 105

His His Ser Phe Val Gln Glu Cys Asn Cys Ser Ile Tyr Pro Gly

110 115 120

Thr Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Thr Ala Thr Leu Ile Leu Ala Tyr Val Met Arg Val Pro

140 145 150

Glu Val Ile Ile Asp Ile Ile Ser Gly Ala His Trp Gly Val Leu

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Val Val Ile Leu Leu Leu Ala Ala Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 81: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81:

Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asp Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Leu Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Ala Leu Gln Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK11

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82:

Val Glu Val Arg Asn Thr Ser Ser Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

His Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Ala His Ile Asp Met Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Val Ser Pro Glu

95 100 105

His His His Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 83:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SW3 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 



Val Glu Val Arg Asn Ile Ser Ser Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

His Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Ala His Val Asp Met Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Met Ile Val Ser Gln Ala Phe Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

Arg Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190







(2) INFORMATION FOR SEQ ID NO: 84:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: T8

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84:

Val Glu Val Arg Asn Thr Ser Phe Ser Tyr Tyr Ala Thr Asn Asp

5 10 15

Cys Ser Asn Asn Ser Ile Thr Trp Gln Leu Thr Asn Ala Val Leu

20 25 30

His Leu Pro Gly Cys Val Pro Cys Glu Asn Asp Asn Gly Thr Leu

35 40 45

Arg Cys Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg

50 55 60

Gly Ala Leu Thr His Asn Leu Arg Thr His Val Asp Val Ile Val

65 70 75

Met Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Met Ile Ala Ser Gln Ala Phe Ile Ile Ser Pro Glu

95 100 105

Arg His Asn Phe Thr Gln Glu Cys Asn Cys Ser Ile Tyr Gln Gly

110 115 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Leu Asn Trp

125 130 135

Ser Pro Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro

140 145 150

Glu Leu Val Leu Glu Val Val Phe Gly Gly His Trp Gly Val Val

155 160 165

Phe Gly Leu Ala Tyr Phe Ser Met Gln Gly Ala Trp Ala Lys Val

170 175 180

Ile Ala Ile Leu Leu Leu Val Ala Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 85:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S83

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85:

Val Glu Val Lys Asp Thr Gly Asp Ser Tyr Met Pro Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Trp Gln Leu Glu Gly Ala Val Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Glu Arg Thr Ala Asn Val Ser

35 40 45

Arg Cys Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro

50 55 60

Gly Ala Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val

65 70 75
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L'iCL OCX 


Ala 

nXCL 


Th-r 

XLiX 


VAX 




QOT" 








o u 






vjxy Ala 






UC Li. 


A1 a 


Al a 






q c; 






EL J- is Ills 


XLXL. 




VolX 


V7XXI 


VjX IX 








Tin 
XX u 






Arg lie 


inr 




nxS 


Arg 


Met- 








125 






Ser Pro 


Thr 


Thr 


Thr 


Met 


Leu 








140 






Glu Val 


lie 


Leu 


Asp 


lie 


Val 








155 






Phe Gly 


Leu 


Ala 


Tyr 


Phe 


Ser 








170 






He Val 


He 


Leu 


Leu 


Leu 


Thr 



10 185 



106 - 



Ala 


Leu 


Tvr 


Val Glv Asn 


Val Pva 






O -3 




on 


Gin 


Val 


Val 


Val Val Ser 

•ax vax 


Prn m n 

* X \J wXlX 






i nn 




XUD 




A cm 
noil 


fHrQ 


Cot* Tl o Tvr 
wcx xxc i y l 


Prn ^11 xr 

JtrXVJ V?Xjr 






11C 
113 




ion 
X^U 


Al a 

nXCL 


irp 




Mo r- Mot* Moh 
incL net HcC 


Asn ixp 






i in 

1J u 




XJD 


Leu 


Ala 


Tyr 


Leu Val Arg 


He Pro 






145 




150 


Thr 


Gly 


Gly 


His Trp Gly 


Val Met 






160 




165 


Met 


Gin 


Gly 


Ser Trp Ala 


Lys Val 






175 




180 


Ala 


Gly 


Val 


Glu Ala 








190 







(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK12

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86:

Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Val Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Met

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: HK10

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87:

Leu Glu Trp Arg Asn Val Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Ser Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190





(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S2

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Met Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Val Leu Arg Leu Pro

140 145 150

Gln Thr Val Phe Asp Ile Ile Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Val Met Phe Ser Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: S52

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Val Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30






His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Met Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asn Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Val Met Ile Met Phe Ser Gly Val Asp Ala

185 190

(2) INFORMATION FOR SEQ ID NO: 90: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: S54

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90:

Leu Glu Trp Arg Asn Thr Ser Gly Leu Tyr Ile Leu Thr Asn Asp

5 10 15

Cys Ser Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Val Ile Leu

20 25 30

His Thr Pro Gly Cys Val Pro Cys Val Gln Asp Gly Asn Thr Ser

35 40 45

Thr Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Arg Tyr Val

50 55 60

Gly Ala Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val

65 70 75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Met Cys

80 85 90

Gly Ala Val Phe Leu Val Gly Gln Ala Phe Thr Phe Arg Pro Arg

95 100 105

Arg His Gln Thr Val Gln Thr Cys Asn Cys Ser Leu Tyr Pro Gly

110 115 120

His Leu Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp

125 130 135

Ser Pro Ala Val Gly Met Val Val Ala His Ile Leu Arg Leu Pro

140 145 150

Gln Thr Leu Phe Asp Ile Leu Ala Gly Ala His Trp Gly Ile Leu

155 160 165

Ala Gly Leu Ala Tyr Tyr Ser Met Gln Gly Asn Trp Ala Lys Val

170 175 180

Ala Ile Ile Met Ile Met Phe Ser Gly Val Asp Ala

185 190



(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z4 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91:

Glu His Tyr Arg Asn Ala Ser Gly Ile Tyr His Ile Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp His His Ile Leu
                20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Val Met Thr Gly Asn Thr Ser
                35                  40                  45

Arg Cys Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro
                50                  55                  60

Gly Ala Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val
                65                  70                  75

Gly Ala Ala Thr Leu Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Gly Ala Phe Leu Met Gly Gln Met Ile Thr Phe Arg Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Glu Cys Asn Cys Ser Ile Tyr Thr Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro
                140                 145                 150

Thr Ala Phe Leu Asp Met Val Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Tyr Phe Ser Met Gln Gly Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190


(2) INFORMATION FOR SEQ ID NO: 92:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: Z1



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 



Val His Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Thr Ser Ile Val Tyr Glu Thr Glu His His Ile Met
                20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Val Arg Thr Glu Asn Thr Ser
                35                  40                  45

Arg Cys Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro
                50                  55                  60

Asn Ala Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val
                65                  70                  75

Gly Ala Ala Thr Met Cys Ser Ala Phe Tyr Ile Gly Asp Leu Cys
                80                  85                  90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Asp Phe Arg Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Pro Gly
                110                 115                 120

His Val Ser Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro
                140                 145                 150

Ser Ile Leu Gly Asp Leu Leu Thr Gly Gly His Trp Gly Val Leu
                155                 160                 165

Ala Gly Leu Ala Phe Phe Ser Met Gln Ser Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Glu Gly
                185                 190



(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: Z6



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93:

Val Asn Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Glu His Gln Ile Leu
                20                  25                  30

His Leu Pro Gly Cys Leu Pro Cys Val Arg Val Gly Asn Gln Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Val Ser Tyr Ile
                50                  55                  60

Gly Ala Pro Leu Asp Ser Leu Arg Arg His Val Asp Leu Met Val
                65                  70                  75

Gly Ala Ala Thr Val Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
                80                  85                  90

Gly Gly Ala Phe Leu Val Gly Gln Met Phe Ser Phe Gln Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Ala Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Leu Leu Leu Ala Gln Val Met Arg Ile Pro
                140                 145                 150

Ser Thr Leu Val Asp Leu Leu Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Val Gly Leu Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190







(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: Z7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94:

Val Asn Tyr His Asn Ala Ser Gly Val Tyr His Ile Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Met Tyr Glu Ala Glu His His Ile Leu
                20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Gln Ser
                35                  40                  45

Arg Cys Trp Val Ala Leu Thr Pro Thr Val Ala Ala Pro Tyr Ile
                50                  55                  60

Gly Ala Pro Leu Glu Ser Ile Arg Arg His Val Asp Leu Met Val
                65                  70                  75
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<jj.y Aid 


Aid. 


Thr 


vax 


Cys 


oer 








O ft 

oU 






fllxr /ll v 

vaiy uiy 


VcLX 


r*ne 


Leu 


vax 


vjxy 








Cx c 








lxp 


1 IJX 


llir 


ulU 


ASp 








ill) 




His Val 


Thr 


Gly 


His 


Ara 


Met 








125 




Ser Pro 


Thr 


Thr 


Thr 


Leu 


Val 








140 






Ser Thr 


Leu 


Val 


Asp 


Leu 


Leu 








155 






lie Gly 


Val 


Ala 


Tyr 


Phe 


Cys 








170 






lie Leu 


Val 


Leu 


Phe 


Leu 


Tyr 



10 185 






Ala 


Leu 


Tyr 


lie Gly Asp 


Leu Cys 






o c 

OD 




90 




Mot" 


irlie 


oer rTie Gin 


Pro Arg 






XUU 




105 


uys 


Ash 


Cys 


oer xxe Tyr 


Ala Gly 






lib 




120 


7V"L a 
ni.a 


irp 


^ en 
ASp 


wex. raec W6l 


ash Trp 






IJU 




135 


Leu 


Ala 


Gin 


Val Met Arg 


lie Pro 






145 




150 


Thr 


Gly 


Gly 


His Trp Gly 


lie Leu 






160 




165 


Met 


Gin 


Ala 


Asn Trp Ala 


Lys Val 






175 




180 


Ala 


Gly 


Val 


Asp Ala 








190 







(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: DK13

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95:

Tyr Asn Tyr Arg Asn Ser Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Thr Asp Tyr His Ile Leu
                20                  25                  30

His Leu Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Lys Ser
                35                  40                  45

Thr Cys Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu
                50                  55                  60

Asn Ala Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val
                65                  70                  75

Gly Gly Ala Thr Leu Cys Ser Ala Leu Tyr Ile Gly Asp Val Cys
                80                  85                  90

Gly Gly Val Phe Leu Val Gly Gln Leu Phe Thr Phe Gln Pro Arg
                95                  100                 105

Arg His Trp Thr Thr Gln Asp Cys Asn Cys Ser Ile Tyr Thr Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro
                140                 145                 150

Gly Ala Met Val Asp Leu Leu Ala Gly Gly His Trp Gly Ile Leu
                155                 160                 165

Val Gly Ile Ala Tyr Phe Ser Met Gln Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190

(2) INFORMATION FOR SEQ ID NO: 96: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA1

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Ser Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
                35                  40                  45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Thr Phe
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
                80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
                95                  100                 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Met Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Gly
                185                 190





(2) INFORMATION FOR SEQ ID NO: 97: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA4

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asp Asn Val Ser
                35                  40                  45

Lys Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
                80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
                95                  100                 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Leu Met Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Ile Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA5

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Lys Glu Gly Asn Val Ser
                35                  40                  45
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Arg 


Cys 


Trp 


Val 


Gin 
cn 


Gly 


H 1 i 


-it- -i 

vai 


Thr 


Aid 








bo 


Gly 


Gly 


Ala 


Ala 


Leu 

on 


Gly 


Ala 


Val 


Phe 


Leu 








QC 


Gin 


His 


Thr 


Thr 


Val 

11/1 


His 


He 


Thr 


Gly 


His 








125 


Ser 


Pro 


Thr 


Thr 


Ala 

140 


Gin 


Val 


Val 


lie 


Asp 
155 


Phe 


Ala 


Val 


Ala 


Tyr 
170 


Val 


Leu 


Val 


Leu 


Phe 
185 






lie 


Thr 


Pro 


Thr 


Leu 

33 


Pro 


Leu 


Arg 


Arg 


Val 


Cys 


Ser 


Ala 


Leu 


Tyr 


Val 


Gly 


Gin 


Met 


Phe 
i no 


Gin' 


Asp 


Cys 


Asn 


Cys 


Arg 


Met 


Ala 


Trp 


Asp 
i on 


Leu 


Val 


Met 


Ala 


Gin 
145 


lie 


lie 


Ala 


Gly 


Gly 
160 


Phe 


Ala 


Ser 


Ala 


Ala 
175 


Leu 


Phe 


Ala 


Gly 


Val 
190 





Ala Pro 


Asn Leu 






60 


V CLJL 


Asp Tyr 


Leu Ala 






75 


V O.X 


Gly Asp 


Ala Cys 






90 


Thr 


Tyr Arg 


Pro Arg 






105 


Qpr 


He Tyr Ser Gly 






120 


Met 


Met Met 


Asn Trp 






135 


Val 


Leu Arg 


lie Pro 




150 


His 


Trp Gly Val Leu 






165 


Asn 


Trp Ala 


Lys Val 






180 


Asp 


Gly 





(2) INFORMATION FOR SEQ ID NO: 99:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens
(C) INDIVIDUAL ISOLATE: SA6

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 99:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Lys Asp Asn Val Ser
                35                  40                  45

Arg Cys Trp Val His Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Val Cys
                80                  85                  90

Gly Ala Leu Phe Leu Val Gly Gln Met Phe Thr Tyr Arg Pro Arg
                95                  100                 105

Gln His Ala Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Ala Thr Ala Leu Val Met Ala Gln Met Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 100: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens

(C) INDIVIDUAL ISOLATE: SA7

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100:

Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asn Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Asn Asn Val Ser
                35                  40                  45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Asn Leu
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
                80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Ser Tyr Arg Pro Arg
                95                  100                 105

Gln His Thr Thr Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Gly His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Phe Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190







(2) INFORMATION FOR SEQ ID NO: 101:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 192 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(vi) ORIGINAL SOURCE:

(A) ORGANISM: homosapiens 
(C) INDIVIDUAL ISOLATE: SA13 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 



Val Pro Tyr Arg Asn Ala Ser Gly Val Tyr His Val Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Tyr Glu Ala Asp Asp Leu Ile Leu
                20                  25                  30

His Ala Pro Gly Cys Val Pro Cys Val Arg Gln Gly Asn Val Ser
                35                  40                  45

Arg Cys Trp Val Gln Ile Thr Pro Thr Leu Ser Ala Pro Ser Leu
                50                  55                  60

Gly Ala Val Thr Ala Pro Leu Arg Arg Ala Val Asp Tyr Leu Ala
                65                  70                  75

Gly Gly Ala Ala Leu Cys Ser Ala Leu Tyr Val Gly Asp Ala Cys
                80                  85                  90

Gly Ala Val Phe Leu Val Gly Gln Met Phe Thr Tyr Ser Pro Arg
                95                  100                 105

Arg His Asn Val Val Gln Asp Cys Asn Cys Ser Ile Tyr Ser Gly
                110                 115                 120

His Ile Thr Gly His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Ala Leu Val Met Ala Gln Leu Leu Arg Ile Pro
                140                 145                 150

Gln Val Val Ile Asp Ile Ile Ala Gly Ala His Trp Gly Val Leu
                155                 160                 165

Phe Ala Ala Ala Tyr Tyr Ala Ser Ala Ala Asn Trp Ala Lys Val
                170                 175                 180

Val Leu Val Leu Phe Leu Phe Ala Gly Val Asp Ala
                185                 190





(2) INFORMATION FOR SEQ ID NO: 102: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: homosapiens 

(C) INDIVIDUAL ISOLATE: HK2 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102:

Leu Thr Tyr Gin Asn Ser Ser Gin Leu Tyr His Leu Thr Asn Asp
                5                   10                  15

Cys Pro Asn Ser Ser Ile Val Leu Glu Ala Asp Ala Met Ile Leu
                20                  25                  30

His Leu Pro Gin Cys Leu Pro Cys Val Arg Val Asp Asp Arg Ser
                35                  40                  45

Thr Cys Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala
                50                  55                  60

Ser Thr Pro Ala Thr Gin Phe Arg Arg His Val Asp Leu Leu Ala
                65                  70                  75

Gin Ala Ala Val Val Cys Ser Ser Leu Tyr Ile Gin Asp Leu Cys
                80                  85                  90

Gin Ser Leu Phe Leu Ala Gin Gin Leu Phe Thr Phe Gin Pro Arg
                95                  100                 105

Arg His Trp Thr Val Gin Asp Cys Asn Cys Ser Ile Tyr Thr Gin
                110                 115                 120

His Val Thr Gin His Arg Met Ala Trp Asp Met Met Met Asn Trp
                125                 130                 135

Ser Pro Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro
                140                 145                 150

Glu Ile Cys Ala Ser Val Ile Phe Gin Gin His Trp Gin Ile Leu
                155                 160                 165

Leu Ala Val Ala Tyr Phe Gin Met Ala Gin Asn Trp Leu Lys Val
                170                 175                 180

Leu Ala Val Leu Phe Leu Phe Ala Gin Val Glu Ala
                185                 190



(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GCGTCCGGGT TCTGGAAGAC GGCGTGAACT ATGCAACAGG 40 



(2) INFORMATION FOR SEQ ID NO: 104:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 104: 

AGGCTTTCAT TGCAGTTCAA GGCCGTGCTA TTGATGTGCC 40 






(2) INFORMATION FOR SEQ ID NO: 105:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105:

AAGACGGCGT GAACTATGCA ACAGGGAACC TTCCTGGTTG 40

(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

AGTTCAAGGC CGTGCTATTG ATGTGCCAAC TGCCGTTGGT 40 






(2) INFORMATION FOR SEQ ID NO: 107: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 

AAGACGGCGT GAATTCTGCA ACAGGGAACC TTCCTGGTTG 40 



(2) INFORMATION FOR SEQ ID NO: 108:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108:

AGTTCAAGGC CGTGGAATTC ATGTGCCAAC TGCCGTTGGT 40 



(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109:

ARCTYCGACG TYACATCGAY CTGCTYGTYG GRAGYGCCAC CC 

(2) INFORMATION FOR SEQ ID NO: 110: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 



RCARGCCRTC TTGGAYATGA TCGCTGGWGC Y 



(2) INFORMATION FOR SEQ ID NO: 111:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111:

CRATACGACR YCAYGTCGAY TTGCTCGTTG GGGCGGCTRY YT 



(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112:

RCAAGCTRTC RTGGAYRTGG TRRCRGGRGC C



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 

TTGCGGACKC ACATYGACAT GGTYGTGATG TCCGCCACGC 40 

(2) INFORMATION FOR SEQ ID NO: 114:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114:

GATGCGCGTT CCCGAGGTCA TCWTAGACAT CRTYRGCGGR GCD 43 

(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

AATGGCACCY TGCRCTGCTG GATACAAGTR ACACCTAATG TGGCTGTGAA 50
ACAC 54



(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116:

ARCTAGYC CTYSARGTYG TCTTCGGYGG Y 31 


(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear



WO 95/01442 



PCT/US94/07320 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

GCCAACGTCT CTCGATGTTG GGTGCCGGTT GCCCCCAATC TCGCCATAAG 
TCAA 



(2) INFORMATION FOR SEQ ID NO: 118:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118:

AAGGGCCTGC GAGCACACAT CGATATCATC GTGATGTCTG CTACGG 



(2) INFORMATION FOR SEQ ID NO: 119: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119:

TTGGTGCGCA TCCCGGAAGT CATCTTGGAT ATTGTTACAG GAGGT

(2) INFORMATION FOR SEQ ID NO: 120:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120:

AGTCAGGTAY GTCGGAGCAA CCACCGCYTC GATACGCAGT

(2) INFORMATION FOR SEQ ID NO: 121:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 46 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121:

AGCCTTCACG TTCAGACCKC GTCGCCATCA AACRGTCCAG ACCTGT



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 75 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122:

TCCCCCGCYG TGGGTATGGT GGTRGCGCAC RTYCTGCGDY TGCCCCAGAC 
CKTGTTYGAC ATAMTRGCYG GGGCC 

(2) INFORMATION FOR SEQ ID NO: 123:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 39 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123:

ACGCCGGTGA CGCCTACAGT GGCTGTCGCA CACCCGGGC

(2) INFORMATION FOR SEQ ID NO: 124:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124:

ATGAGGGTCC CCACAGCCTT TCTCGACATG GTTGCCGGAG GC

(2) INFORMATION FOR SEQ ID NO: 125:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125:

CGCGCCCTAT CCCAACGCAC CGTTAGAGTC CATGCGCAGG

(2) INFORMATION FOR SEQ ID NO: 126:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

TCAGATCTTA CGGATCCCCT CTATCCTAGG TGACTTGCTC ACCGGGGGT 



(2) INFORMATION FOR SEQ ID NO: 127:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127:

CAGTCACGCT GCTGGGTGGC CCTTACTCCC ACCGTGGCGG YGYCTTATAT 
CGGT 



(2) INFORMATION FOR SEQ ID NO: 128:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128:

TAGCACTCTG GTRGAYCTAC TCRCTGGAGG G



(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 54 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129:

AAGTCTACAT GCTGGGTGTC TCTCACCCCC ACCGTGGCTG CGCAACATCT
GAAT

(2) INFORMATION FOR SEQ ID NO: 130:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

AGGCGCCATG GTCGACCTGC TTGCAGGCGG C 31 



(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 43 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131:

TCAGCCCCGA VYYTCGGAGC GGTCACGGCT CCTCTTCGGA GGG 43



(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

TGYTACGGAT YCCCCARGTG GTCATHGACA TCATWGCCGG GGSC 44 






(2) INFORMATION FOR SEQ ID NO: 133: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 

CATACCAAAT GCTTCCACGC CCGCAACGGG ATTCCGCAGG 40 



(2) INFORMATION FOR SEQ ID NO: 134: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

TCTTCTTGCG GGCGCCGCAG TGGTTTGCTC ATCCCTG 37

(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 52 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 



ATCTAGCATC TTGAGGGTAC CTGAGATTTG TGCGAGTGTG ATATTTGGTG 50 
GC 52 

(2) INFORMATION FOR SEQ ID NO: 136: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

Trp Ile Gln Val Thr Pro Asn Val Ala Val Lys His Arg Gly Ala

5 10 15

Leu Thr His Asn Leu Arg Xaa His Xaa Asp Xaa Ile Val Met Ala
20 25 30

Ala Thr Val 



(2) INFORMATION FOR SEQ ID NO: 137:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137:



Trp Val Pro Val Ala Pro Asn Leu Ala Ile Ser Gln Pro Gly Ala

5 10 15

Leu Thr Lys Gly Leu Arg Ala His Ile Asp Ile Ile Val Met Ser

20 25 30 

Ala Thr Val 



(2) INFORMATION FOR SEQ ID NO: 138:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Trp Ile Pro Val Xaa Pro Asn Val Ala Val Xaa Xaa Pro Gly Ala

5 10 15 

Leu Thr Gln Gly Leu Arg Thr His Ile Asp Met Val Val Met Ser

20 25 30 

Ala Thr Leu 



(2) INFORMATION FOR SEQ ID NO: 139:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139:

Trp Thr Xaa Val Thr Pro Thr Val Ala Val Arg Tyr Val Gly Ala 

5 10 15 

Thr Thr Ala Ser Ile Arg Ser His Val Asp Leu Leu Val Gly Ala

20 25 30 

Ala Thr Xaa 



(2) INFORMATION FOR SEQ ID NO: 140: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Trp Val Ala Leu Xaa Pro Thr Leu Ala Ala Arg Asn Xaa Xaa Xaa

5 10 15

Xaa Thr Xaa Xaa Ile Arg Xaa His Val Asp Leu Leu Val Gly Ala
20 25 30

Ala Xaa Phe

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141: 

Trp Val Xaa Xaa Xaa Pro Thr Val Ala Thr Arg Asp Gly Lys Leu 

5 10 15 

Pro Xaa Xaa Gln Leu Arg Arg Xaa Ile Asp Leu Leu Val Gly Ser

20 25 30

Ala Thr Leu

(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 



Trp Thr Pro Val Thr Pro Thr Val Ala Val Ala His Pro Gly Ala 

5 10 15 

Pro Leu Glu Ser Phe Arg Arg His Val Asp Leu Met Val Gly Ala 
20 25 30

Ala Thr Leu 



(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:

Trp Val Ala Leu Thr Pro Thr Val Ala Xaa Xaa Tyr Ile Gly Ala
5 10 15

Pro Leu Xaa Ser Xaa Arg Arg His Val Asp Leu Met Val Gly Ala 

20 25 30 

Ala Thr Val 



(2) INFORMATION FOR SEQ ID NO: 144: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144: 

Trp Val Ser Leu Thr Pro Thr Val Ala Ala Gln His Leu Asn Ala

5 10 15 

Pro Leu Glu Ser Leu Arg Arg His Val Asp Leu Met Val Gly Gly 

20 25 30 

Ala Thr Leu 



(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 

Trp Val Pro Leu Thr Pro Thr Val Ala Ala Pro Tyr Pro Asn Ala 

5 10 15 

Pro Leu Glu Ser Met Arg Arg His Val Asp Leu Met Val Gly Ala 

20 25 30 

Ala Thr Met 



(2) INFORMATION FOR SEQ ID NO: 146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:

Trp Val Xaa Ile Thr Pro Thr Leu Ser Ala Pro Xaa Xaa Gly Ala

5 10 15 

Val Thr Ala Pro Leu Arg Arg Xaa Val Asp Tyr Leu Ala Gly Gly 

20 25 30 

Ala Ala Leu

(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147:

Trp His Ala Val Thr Pro Thr Leu Ala Ile Pro Asn Ala Ser Thr

5 10 15 

Pro Ala Thr Gly Phe Arg Arg His Val Asp Leu Leu Ala Gly Ala 

20 25 30 

Ala Val Val 

(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148:

Thr Leu Thr Met Ile Leu Ala Tyr Ala Ala Arg Val Pro Glu Leu

5 10 15 

Xaa Leu Xaa Val Val Phe Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 149: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

Thr Thr Thr Met Leu Leu Ala Tyr Leu Val Arg Ile Pro Glu Val

5 10 15

Ile Leu Asp Ile Val Thr Gly Gly

20 

(2) INFORMATION FOR SEQ ID NO: 150: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Thr Xaa Thr Xaa Ile Leu Ala Tyr Xaa Met Arg Val Pro Glu Val

5 10 15

Ile Xaa Asp Ile Xaa Xaa Gly Ala

20

(2) INFORMATION FOR SEQ ID NO: 151:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151:

Ala Val Gly Met Val Val Ala His Xaa Leu Arg Leu Pro Gln Thr

5 10 15

Xaa Phe Asp Ile Xaa Ala Gly Ala

20 



(2) INFORMATION FOR SEQ ID NO: 152: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown 

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

Thr Xaa Ala Leu Val Xaa Ser Gln Leu Leu Arg Xaa Pro Gln Ala

5 10 15 

Xaa Xaa Asp Xaa Val Xaa Gly Ala 

20 



(2) INFORMATION FOR SEQ ID NO: 153: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown
(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

Thr Xaa Ala Leu Val Xaa Ala Gln Leu Leu Arg Xaa Pro Gln Ala

5 10 15

Xaa Leu Asp Met Ile Ala Gly Ala
20



(2) INFORMATION FOR SEQ ID NO: 154: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154:

Thr Thr Thr Leu Leu Leu Ala Gln Ile Met Arg Val Pro Thr Ala

5 10 15 

Phe Leu Asp Met Val Ala Gly Gly 

20 



(2) INFORMATION FOR SEQ ID NO: 155:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155:

Thr Thr Thr Leu Xaa Leu Ala Gln Val Met Arg Ile Pro Ser Thr

5 10 15 

Leu Val Asp Leu Leu Xaa Gly Gly 

20 

(2) INFORMATION FOR SEQ ID NO: 156: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156:

Thr Ala Thr Leu Val Leu Ala Gln Leu Met Arg Ile Pro Gly Ala

5 10 15 

Met Val Asp Leu Leu Ala Gly Gly 

20 



(2) INFORMATION FOR SEQ ID NO: 157: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Thr Ser Ala Leu Ile Met Ala Gln Ile Leu Arg Ile Pro Ser Ile

5 10 15

Leu Gly Asp Leu Leu Thr Gly Gly

20 



(2) INFORMATION FOR SEQ ID NO: 158:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 23 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

Xaa Thr Ala Leu Xaa Met Ala Gln Xaa Leu Arg Ile Pro Gln Val
5 10 15

Val Ile Asp Ile Ile Ala Gly Xaa

20 

(2) INFORMATION FOR SEQ ID NO: 159: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: unknown

(D) TOPOLOGY: unknown

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159:

Thr Thr Thr Leu Val Leu Ser Ser Ile Leu Arg Val Pro Glu Ile
5 10 15

Cys Ala Ser Val Ile Phe Gly Gly

20 



CLAIMS

1. A cDNA of the envelope 1 gene of the
hepatitis C virus wherein the cDNA has a sequence selected
from the group consisting of SEQ ID NO:1 through SEQ ID
NO:51.

2. A recombinant hepatitis C virus envelope 1
protein encoded by a gene whose sequence includes a
sequence selected from the group consisting of SEQ ID NO:1
through SEQ ID NO: 51.

3. A recombinant protein having an amino acid
sequence selected from the group consisting of SEQ ID NO: 52 
through SEQ ID NO: 102. 

4. A method for the recombinant DNA-directed synthesis
of at least one complete envelope 1 protein of hepatitis
C virus said method comprising:

culturing a transformed or transfected host
organism containing a DNA sequence capable
of directing the host organism to produce an
envelope 1 protein under conditions such
that the protein is produced, said protein
exhibiting substantial homology to a protein
comprising the amino acid sequence selected
from the group consisting of SEQ ID NO: 52
through SEQ ID NO: 102.



5. The method of claim 4, wherein the host 
organism is transfected with a recombinant eukaryotic 
expression vector. 

6. The method of claim 4, wherein the 
eukaryotic vector is a baculovirus vector. 



7. The method of claim 4, wherein the host
organism is a eukaryotic cell.

8. The method of claim 7, wherein the 
eukaryotic cell is an insect cell. 

9. A recombinant expression vector comprising a
cDNA sequence selected from the group consisting of SEQ ID
NO:1 through SEQ ID NO: 51.

10. A host organism transformed or transfected
with a recombinant expression vector according to claim 9.

11. A method of detecting antibodies to HCV in a 
biological sample suspected of containing said antibodies 
comprising: 

(a) contacting the sample with at least one
recombinant protein of claim 3 to form
an immune complex with the antibodies;
and

(b) detecting the presence of the immune
complex.

12. The method of claim 11 wherein the
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

13. The method of claim 11, wherein the 
recombinant envelope 1 protein is bound to a solid support. 

14. The method of claim 11, wherein the immune 
complex is detected using a labeled antibody.

15. A hepatitis C virus kit comprising: at least
one recombinant protein comprising an amino acid sequence
selected from the group consisting of: SEQ ID NO: 52 through
SEQ ID NO: 102.

16. A pharmaceutical composition comprising at
least one recombinant protein of claim 3 and a suitable 
excipient, diluent or carrier. 

17. A method of preventing hepatitis C
infection, comprising administering the pharmaceutical
composition of claim 16 to a mammal in an effective amount
to stimulate the production of protective antibody.

18. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one recombinant
protein according to claim 3 in a pharmacologically
acceptable carrier.



19. A method for detecting the presence of the
hepatitis C virus via a reverse transcription-polymerase
chain reaction process, wherein the primers are selected
from the sequences shown in SEQ ID NO: 103 through SEQ ID
NO:108.

20. Substantially isolated and purified primers,
wherein said primers have nucleic acid sequences selected
from the group consisting of SEQ ID NO: 103 through SEQ ID
NO: 108.



21. A diagnostic kit for use in detecting the
presence of hepatitis C virus, said kit comprising: primers
having nucleic acid sequences selected from the group 
consisting of SEQ ID NO: 103 through SEQ ID NO: 108. 

22. A method for determining the genotype of a
hepatitis C virus, said method comprising:

(a) amplifying RNA via reverse
transcription-polymerase chain reaction
to produce amplification products;
(b) contacting said products with at least
one genotype-specific oligonucleotide;
and

(c) detecting complexes of said products
which bind to said oligonucleotide(s).

23. The method of claim 22, wherein said
amplification of step (a) uses primer having a sequence
according to SEQ ID NO: 103 through SEQ ID NO: 108.

24. The method of claim 23, wherein said
oligonucleotide of the step (b) is a nucleic acid sequence
selected from the group consisting of SEQ ID NO: 109 through
SEQ ID NO: 135.

25. Substantially isolated and purified
oligonucleotides, wherein said oligonucleotides have
nucleic acid sequences selected from the group consisting
of SEQ ID NO: 109 through SEQ ID NO: 135.

26. A diagnostic kit for determining the
genotype of a hepatitis C virus, said kit comprising
primers selected from the group consisting of SEQ ID NO: 103
through SEQ ID NO: 108 and hybridization probes selected
from the group consisting of SEQ ID NO: 109 through SEQ ID
NO:135.

27. A substantially purified and isolated 
peptide having an amino acid sequence selected from the 
group consisting of SEQ ID NO: 136 through SEQ ID NO: 159.

28. A method of detecting antibodies specific
for a single genotype of HCV, said method comprising:

(a) contacting a biological sample with at
least one peptide of claim 27 to form
an immune complex with the antibodies,
and
(b) detecting the presence of the immune
complex.

29. The method of claim 28, wherein the
biological sample is selected from the group consisting of
serum, saliva or lymphocytes or other mononuclear cells.

30. The method of claim 28, wherein said peptide 
is bound to a solid support. 

31. The method of claim 28, wherein the immune
complex is detected using a labelled antibody.

32. A kit for use in detecting hepatitis C virus
antibodies, said kit comprising: at least one peptide
selected from the group consisting of SEQ ID NO: 136 through
SEQ ID NO:159.

33. A pharmaceutical composition comprising at
least one peptide of claim 27 and a suitable excipient,
diluent or carrier.

34. A method of preventing hepatitis C
infection, comprising administering the pharmaceutical
composition of claim 33 to a mammal in an effective amount
to stimulate production of a protective antibody.

35. A vaccine for immunizing a mammal against
hepatitis C infection, comprising at least one peptide
according to claim 27 in a pharmaceutically acceptable
carrier.
FIGURE 1A

[Nucleotide sequence alignment with a consensus row. The aligned sequence rows and match lines are not recoverable from the scanned text.]

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1 TJm5JUM?TGCGaVAC CT Cnra«3G^^ 

I 



inn 

TCTACCAtGTCACgM 

lllllll Mill II 

IGTACCAaGTCACcAA 

II 



1 1 1 1 



lllllll Mil 

rCCAACTtAAGCA 
Mill 



lllllllllll I 



1 TATCAAGTGCGCyUVCGTGTCCGG<XnTTiaC 

I 



nun mill i 

KTGAAGTGCGCAACGTgTCOGGGGI 

Mill IMIIIIM II lllllll 

TAXGJUUnXXZGCAACCTaTCCGGTCcGTJ^C^^ 
tACGAaGTGCgCAACGTgTCCGGGgtgTAccAtGTaVCgAAc^CTGc^rcCAACTcaAGca 
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3CGTTCGGGA 

Mill MINI lllllllllllllllllllll! IIIIIIIIIIIIIIIIIMIII 



mini ii iii 1 1 1 1 1 1 1 1 1 1 inn iniiiii ill 

TCX?Kn7iTGAaACftGCGGACATC^Tt!VrGCaTAC CCCTGGATGCaTGCCCTGCGTTCGGG 

1 1 1 f 1 1 1 1 1 1 inn iniiiii iiiiiiiiiiiiiiiiii minim mi 

TCG'ltjl'ATGJUSACAl^aGACATGflTCftTGCATACCC 

I IMII MIIIUII llllllllllllll llllllll 
TTGTSTATGAGACAGCGGACAatSATCATGCAcACCCCTBG 

M 1 1 1 1 1 1 1 1 1 f 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 M 

TTGTGT3VTGAGACAGCGGACATGATCATGC* 

Minium iiiiiimiii in imm 



62 TTGTGTATGAGGCAGCSGACATGATCATGCAQlCtCX^GGGTGCGTGCCCTGCSTTCGGGA 



I 111 111 ! Ill MM 

TTGTGT7lTGAGGCAGtGGACgTGATCCTC(JACACL"Ui; 

I I ii ii ii ii i mi mi inn 1 1 1 1 1 1 1 ii 

TTGTGTRTGAGGCAG CGGACATGATCATGCACAC t CC WaUUXUUa 1.-TU CUX-lULKiO 

iiiiiii 1 1 1 i 1 1 1 1 1 1 1 1 ! f 1 1 1 1 1 1 1 1 i i ii iiiiiiiiiiiiiim ii mi 

TTGTGTftTGMGCRGCGGACATGATCATGCAtACCCCCGGGTGCGTGCCCIXScGTcCGGG 

1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill I M 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 II Mil 

TTGTGTATGAGGCAGCGGACATGTVTaATGCAcACCCCCGGGTGCGTGCCCTGtGTTCGGG 

iiiiiii iiiiiimiii mi inn iimiiim ii nm mm n 



TtGTGTatGAggCRgcgGACaTGATcaTGCAcACcCCcGGgTGcgTgCCCTGcGTtCgGGA 
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FIGURE IB 



123 Ga3.CAACcaCTCCCGCTGCTGGGTAGCGCTCACcCCCACGCT 
123 GGgCAACTCCnlxiCGCTGCTGGGTAGCGCT^ 
123 GGACAACTCCTCTCGCTGCTGGGTAGCGCT^ 



123 GGACAACrCCTCTCGCTGCreGGTAGCGCTCACCCCCACGCTC 

IIIIIIIIII II II llllllllllll I II llllll II I I I II Mill I II I 
123 aAACAACTCCTCCCGTTGtTGGGTAGCGCTCgCCCC 

IIIIIIIIIIMIIIII Mill llllll I Mill IIMIIII I Mill I II 

123 GAAC^CTCCTCCCGTTGCTGGGTgGCGCTCA^ 

1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 II Mill 1 1 1 1 1 f 1 1 1 1 M II II 1 1 1 M 1 1 Mill Mill 

123 GAACAACTCCTCCCGCrctTGGGTAGCGCTCACTCC 

II Ml 1 1 1 M 1 1 1 1 1 1 MMIMIMI IIMIIII II I III I Mllllllll II 

123 GAgCAAtTCCTCCCGCTGCTX3GGTAGCGCT1ACTCCCACGCr 

I III III II IIIMIIIIMIMI 1 1 1 MMIMIMI II MMIIMI MM 

123 GGcCAACTCCTCCCGCTGCTGGGTAGCGCTCACTCCC^ 

II Mill Ml I IIIIIIIMII I IMIIIIMII II II II llllll III 

123 GGGCAACTtCTCTaGtTGCTGGGTAGCGCTCACT 



123 GGGCAACTCCTCTCGCTGCTGGGTAGCGCnX!ACTCCC^ 

123 cLuJaicnrcrrc 

III II 1 1 1 lllllll IIMIIII 1 1 M 1 1 1 ! ! 1 1 [ 1 1 1 ! 1 1 1 i 1 1 1 1 1 ! 1 ! I M 1 1 1 

123 GAAOlACTCCTCCCGtTGCTGGGTgGCGCT 

123 GAACAAtTCCTCCCGcTGCTGGGTAGCGCTCACTCCCACGCT 

llllll Mill III II MMIIMI II IIIIIIIIII Mill (MM MINIM I 

123 GAACAACTCCTCCCGtTGCTGGGTAGCGCT^ 

1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 MMIIIIIIMM IIMIII II IIMIIII Mill MM 

123 GPACMCTCCTCCCG<ttCTGGGTAG 

123 GggtJJJrrccrc 

gaacAActcCTCccgcTGcTGGGTaGCGCTcaCtCCC^ 
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FIGURE IB 



184 aTCCCCACrrACC^CaATACGACGCCM^ 

I ! ! i ( ! ! ! 1 ! ! i ! ! I ! 1 1 ! ! 1 1 1 1 1 1 ! 1 1 1 ! I 

184 GTCCCCACTACGACgATACGACGCCATGTC^^ 

MIMIIIIIIIII 1 1 1 1 1 1 1 1 1 1 1 II 111! II II II II 11 II II NH ^'.''^ '' 

184 fmiiiinii i 

164 CTCCCCACI3U3XX:gftp££Aro 
184 

III 

184 GTCCCCAO 

llllllllllllll 
184 GTCCCCACCACGACAA' 

I I I I I I I 1 

184 

184 <^CCCACTAC(^C^A^ 





184 <m:CCCACCACGACAATACGA^ 

III llllllllllllllllll 
184 GTCtCCJ^CACGACJUlTACGACaCaXCGTCG^ 

III Mil 
184 GTCCCCAC 

II lill I! IIMIIIII1MMI 

184 CjIOZCaVCTACC^CAATACGACG 

iiiiiiiiiiiiiniiiiiiii : 

184 GTCCCCAro^CGACAATACGACGCCACGTCGATTTC 

1 1 III II Mill II II II II III 1 1 Mill II II llllllllllllll 1 1 1 1 II I I " 

184 aTCCCaumxCGAOUlTACGACGCC^TGT^ 

MM MUM 

184 GTCCCaACTACGgCAATACGACXaCCAT^ 
184 GTCCCcACclMa 

gTCcCcACtAcGaCaATACGACgcCAcGTCGAtTTCCT 
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245 CCKCA3CTACGTGGGGGATCTC 



245 CCCXnTlTGTACGTGGGGGATCTtTCCG 

245 cx^crlro^ 



245 CaXTTAIOTACGTGGGGGATCTCTCCGGATCW 



!! lllli III! 
rAtGTGGGaGACC 

Mill II I 



1 1 1 1 1 1 



245 CCGCTATCTACGTKXra^TCTC^ 

245 CCStTATGTACGTGGOT 
II 



III 
CGC( 

III 



Ml M M II 111 M ill 

CTaTGCGGATCTGTTlTCCT< 

(I iiiniillliiillli 



ill 

CGCi 
III 



II llllllll I 

SAcCTCTGCGGqTCcGTTTTCCTCaTCTXICCAGCTGTTCACCT 

II llllllll II II MINI llllllll lllllllll 

GATCrcroCGGATCXCTcOTCCTCGTCTCCCAGtTGTTCACCT 

i imiiiiiiiii iiiniiii 



CCGctATGTAcGTGGGgGAtCrcflCCGGaTCtGTttTCCTcgTcrrCcCAGCTGTTCACCtT 
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FIGURE IB 



Mill II 



II 1 1 1 1 1 1 1 1 1 1 1 1 1 f II II Mil IIIIIIMIIIIIIIIII1 

CACGAGAC«3TACAQ3ACTCCAACTGCTC^ 

306 tXaSCCTCXGCGACACra 

CTCGCCTCGCOGACAI 
MIMMIMM II 



II 

„_____-___-_-__-____ ^ ^ JtTC 



IN II MUM IIMIIIIIIIIIIIIIII M MIMI I t ! 1 1 f 1 1 1 1 1 1 ! I III 

f^Tr^rr^YV^CGGtRTCAGAO^CTACAGGACTGCAATTGCTCAA^ 

III IIIIIIIIIIIIIIIIIIMII II 1 1 1 1 1 HI 1 1 III 1 1 1 III 



306 CTCGCCTCGTCaGCATGAGACAGTACACXIMTrc 



III 1 I I I I 1 t t I I I I III lllllllllllllllll M I i I I I I I I I Ml Mill 
CTCaCCTOTTCGGCATtgGACAGTACAGGRCTGCAATTGtTCAATCTATCCt 

in ii ii 1 1 1 1 1 1 iiiiiiiiii Minimi iiiiijLiiiii ii Mill 



cTCgCCtCGcCggcAtgaGACagtaCAGgAcrrcc^crrccrrCaaTCTATC 
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FIGURE IB 



367 TCAGGTCACCGCATCGCTTGGGAtATGATGATGAACTC 
367 TCAGGTCACCG»TGGCTTGGGAcATG^^ 
367 JUZAGGTCACCGCATGGCTTXXKjA^ 

lllllllllll Mil MIIIIIIM III II IMIIMI II MINI Mill HUM 

367 ACAGGTCACCGtATGGCTTGGGATATGATGA^ 
367 AaLsGK^recI^ 

367 TCAGGTCACCGCATGGCTTGGGAraTGAIG^ 
367 TCAGGTC^CGCATOGCITGGGATAZ^^ 

MIIIIIIM Milllll lllllllll IIIIIIIIMI! II III I M 1 1 1 1 f 1 1 

367 aCAGGTCACCGtAItKKTITGGGATATG^ 

iiiiiiiiM milium mmmmiiiii 11 m i 11 n mi 

367 TCAGGTCACCGCATCG CTTGGGAcATGJVTCATGAACT 
367 TCMGTCACCGCATGGCTTGGGATATG^ 

1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 It ! 1 1 1 1 1 1 1 i 1 1 1 1 

367 TCAGGTCACCGCATGGCCTGGGATATGATGATGAACI^ 
367 ACAGGTCACCGCATGGCTTGGGATAI^ 
367 ACAGGTCAC03CMTOCTIX^3GAT^ 
367 TCAGGTCACCGCATOXrrTGGGJ^^ 



367 TCAGGTCSiCCGCXrGGCrrTGGGJn 

367 TCAGGTCACCGCATGGCTIXKKiATATGATGAT^ 

1 1 i 1 1 1 1 IMIIMI MIIIMI III MMIMMMIMI III MIIMIMI 

367 aCAGGTCAtCG<^TGGCcTGGGATATGA^ 
tCAGGTCAcCGcATGGCtTGGGAUVTO 



WO 55/01442 



12/47 



PCT/US94/07320 



SKO ID NO: 


Isolate 


11 


riei 


24 


T10 


10 


D3 


9 


Dl 


14 


HK5 


15 


HK8 


12 


HK3 


23 


T3 


22 


SW2 


17 


IND8 


16 


IND5 


21 


SA10 


20 


S45 


25 


US 6 


13 


HK4 


18 


P10 


19 


S3 


9-25 


consensus 



FIGURE IB 



428 TaTCGCAGTTACTCCGaATCCCAOXAGCTGTC^ 

I I Mi I ! ! ! I illil lllllllililllll lllllllllll I MIIIIIIIIU 

428 TgTCGCAGTTACTCCGGATCCCACAAGCTCT 

428 bdroau^ 



428 TATCGCAGTTACTCCGGATCCCACAAG^ 
428 TCTC^MctIct 

M 1 1 1 1 M 1 1 1 M 1 1 1 i 1 M 1 i 1 1 M 1 1 1 llllll II INI I lllllllililllll 

428 Ttm^CAGTTACICO^TCCCGCAA^ 
428 l^m^O^TO 

iiiiiii ii iiiiiiiiiii in mini nut in n iiiiiiiiin in in 

428 TCTCGCAGTTgCTCCXXsATCCCACAAGCTCT 

| MIJIIIl II I II II I Ml IN I I MM II fl I Ii 1 II I I lllllllililllll" 
428 TATCGCAGTTaCTCCGGATCCCACAAGCTGT^^ 

428 tItMMgTTGCT^ 



428 TATCGCAGTTGCTCCGGATCCCACAAGCT 

llllllllll IIIIIIIIIMIIIIIII IIIIIII 1 1 1 i 1 1 1 1 f ! 1 E I I I 1 I I I 1 I 
428 TATCGQVGTTACTCCGGATCCCACAAGCTaT^ 



428 TATCGCAGTTACTCCX3GATCCCACAAGCTC 
428 l^MaL^^ 



428 TATCGCAGTTACTCCGacTCCCACAAGCTGTCATGGACAT^ 

I llllll IIIIIII lllllllllll II UN llllllllll IIIIIIIU" 
428 TgTCGCAGCTACTCCGGATCCCACAAGCTaTCtTC 

428 TaTTOC^cricr 

TaTCGCAgtTaCTCOTgaTCCCaCAAGCTgTCgTGGAcaTGGTggCgGGgGCCCACTGGOT 



WO 95/01442 



13/47 



PCT/US94/07320



SEQ ID NO:


Isolate 




11 


DK1


489 


24 


T10 


489 


10 


D3 


489 


9 


D1


489 


14 


HK5 


489 


15 


HK8 


489 


12 


HK3 


489 


23 


T3 


489 


22 


SW2 


489 


17 


IND8 


489 


16 


IND5 


489 


21 


SA10 


489 


20 


S45 


489 


25 


US6


489 


13 


HK4 


489


18 


P10 


489 


19 


S9 


489 


9-25 


consensus 





FIGURE 1B

489 itf^CTCCXSSGKCTc^^ 

1 ! i ! ! I i I 



Mill 



MINI llllllllllillllllllllllllllll 

AGTCCTGGCGGGCCTTGCCTACTATTCCATGGTGGO 
489 JuSTCCTSGGsLsCCTTK^^ 



lllllllllll 
AATCCTGGCGGGCCTTCCCBVCTATTCCATGGl 

I llll Ill I IIIIIIMIIIIIIIIIIIIII 



489 AGTCCTGGCGGGCCTTGCCTACTATTCCAXt^nXjiGGGAACTGGGCTiU^GGn 



Mill llllllllllllllllll 1 1 IJLJJ 

AGTCCTaGCGGGCCTTGCtTACTA^ 

mini 1 1 1 1 1 1 1 1 f 1 1 iiiiiiiiiiiiiiiiiniiiiiii inn 1 1 1 1 mil 

lllllilll 



agTCCTgtfSCGGTCCTtGCcTACTAtTCCATGGtg^

550 CTGCTACTCTTTGCCGGCGTTGATGCG

II llllll!!!! 

550 ATGCTACTCTTTGCCGGCGTTGATGGG 

IIIIIIMIIIIII Mill II II 
550 ATGCTACTCTTTGCTGGCGTcGACGGC 

llllll fltlltl I I I I I I I llllll 
550 ATGCXACTCTTTGCTGGCGTTGACGGC 

iniiiii inn mum n 

550 ATGCTACTtTTTGCCGGCGTTGATGGG 

INIIIII llllllllllllllllll 

550 AroCTACTgTTTGCCGGCGTrGATGGG 

iniiiii llllllllllllllllll 

550 ATGCTACTtTTTGCCGGCGTTGATGGG 

I I I I I I I llllllllllllllllll 
550 CTGCTACTCTTTGCCGGCGTTGATGGG 

ii mi mini iiimii m 

550 ATGCTACTCTTTGC tGGCGTTGACGGG 

IIIIIIMIIIIII i I M I 1 1 1 1 1 I I 
550 ATGCTACTCTTTGCCGGCGTTGACGGG 

1 1 1 1 1 II I M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 

550 ATGCTACTCTTTGCCGGCGTTGACGGG 

IIIIIIIIIIIIIIMIIIIIIIIIII 

550 ATGCTACTCTTTGCCGGCGTTGACGGG 

I I I 1 I I t I 1 1 I I I I I I I 1 I I I I I I I t I 
550 ATGCTACTCTTTGCCGGCGTTGACGGG 

lllll l llllll ll llll IIIMIII 
550 CTGCTACTCTTTGCCGGCGTTGACGGG 

IMMI IIIIIIIIIIIMIIIIIII 
550 ATGCTACTCTTTG CCGGCGTTGACGGG 

III MM I Mill II MIMIIMM 

550 ATGCTACTCITTGCCGGCGTTGACGGa 

IMIMM I I II I II llllllll 
550 ATGCTACrtTTTGCtGGtGTTGACGGg 

aTGCTACTcTITGCcGGcGTtGAcGGg 



WO 95/01442 



15/47 



PCT/US94/07320 



FIGURE 1C 



SEQ ID NO: Isolate
26 T2 



27 
28 
29 
26-29 



T4 
T9 
US10 
consensus 



GCcCAAGTGAgGAACACCAac C gCgG tTACATGGTGAC tAACGACitjTTCc^UVTGAgAGCA 

I! IIIINI MINIM' I I lllllllllll IMIMIMM Mill Mil 

GCaOUUTIKSAAGAACACCAcTAaCM 

II lilllllimilll II IIMMIIMII! II II llllllll II lllllll 

GCCgAAGTGAAGAACACaUTI^ 

I | lllllll lllllllllllllllll llllllll llllllll lllllllllllll 
GtCcAAGTCAAaAACACO^ 

GCC cAAGTGAagAACACCAgt acCaGdTAcATGGTGACcAA- GACTGtTCcAA- GAcAGCA 



SEQ ID NO:
26 

27 

28 

29 

26-29 



Isolate
T2 

T4 

T9 

US10

consensus 



62 TCACcTGGCAGCTCC£aGCCGCGGTtCTCCA£ 

Nil lllllllllll llllllll llllllllll lllllll I lllllll Nil 
62 TCACtTGGCAGCTCCAGGCCGCGGTCCTCCACGTC^ 

(III lllll IMMMMIIIIII IIMIIIMIMIMII lllllllllllll I 

62 TCACcOGGCAACTCCAGGCCX^GCTCCTC 

mm iiiiiiii mi nun niiiiM iniiiii iiiiiiiiiiiii in 

62 TCACtTGGCAACTtgAGGCtGCGGTCCTCCACGTtCCCGGGTCtGTCCCGTC 

TCAC - TGGCA- CTecAg^cGCGCTcCTCCACCT^^ agt 



SEQ ID NO:



26 
27 
28 
29 
26-29 



Isolate 
T2 

T4 

T9 

US10

consensus 



123 GCXaJUUlTACATCcCGaTGCTGGA^ 

llllllllllll II llllllllllllll lllllllllllllllllllllllllllll 

123 GGGAAATACATCtCGGTGCTGGATACCGGTtTCACCAAAC^ 

lllll I II lllllllllllllllll II llllllll II Mil II III III 

123 tGGAAAcgCqTCgCGGTGCTGGATACCGGTCTC 

iiiii i ii iniiiiiiiiiiiiiiiii inn ii ii iiiiiiiiiiiini 

123 gGGAAAtaCaTCtCGGTGCTGGATJ^CGGTCT 

gGGAAAtaCaTCtCGgTGCTGGATACCGGTctCaCCAAAcGTgGCcGTGC - GC - GCC-GGC

184 GCtCTtACGCAGGGCTTGCGGACGCACATcGACATGGTO

II II llllimillllllllllllll 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 ! 1 1 1 

184 GCCCTCACGCJUy-teCVltjCGGACGCAC^ 

lllllllllllllllllllllllllllll lllll ill it. ii. 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

184 GCCCTCACXjC^UXKSCTTGCGGACGCACATCGA^ 

lllllllllllllllllllllll llllllllllllll llllllllllllllllllllll 

184 GCCCTCACGCAGGGCTTGCGGACtO^CATCGACATGGTcGTG^ 
GCcCTcAOGCAGGGCTTCCGGACgCACATcGACMGCT

27

28 

29

Isolate
T2 

T4 

T9

consensus



245 CTGCcCTcXACGTGGGGGACCrCTGCGGCGGGGIX^TGCT 

MM II !! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 M 1 1 1 1 1 f 1 1 1 1 1 1 

245 CTGCTCTtTACGTGGGGGACCTCTGCGGCGGGGTGATGCTCGCAGC 

I Mill lllllllllll 1 1 1 1 1 1 1 1 1 1 1 1 1 1 llllllll II M 1 1 II II 1 1 1 I 

245 CCGCTCTcTAGGTGGGGGAtCTCTGCGGCGG 

lllllll MIIMIIMI lllllll Ml I llllllll II II 1 1 1 1 f 1 1 1 f I 

245 COGCTCT tTACGTGGGGGAc tTCTGCGG tGGGaTgATGCTCGCaGCcCAaATGTTCAXTgT 
C-GCtCT-TACGTGGGGGAccTrCTGCGGcGGGgTgATGCTCGCa^

SEQ ID NO:
26 

27 

28 

29

consensus



306 CTCGCCGCgACgcCACTOTTTTGTGCAAG^ 

liiinii ii Miiiiiiiiiiiiiii iniiiiiiii iiiiiiii i: iiiiii 

306 CIXXSCCGCAACAtCACTCGTTrGTGCAAGAcra 

lllllllll II MM I I 1 1 111 I II II Mill Mill II IIIIIIII IIIIII 

306 CiracraOlgCACCACTCGT^ 

llilllll lllllll Mil Mil 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill lllllllll 

306 CTCGCCGCgcCACCACTcGTITGTGCAGGAATGCAACTGCT 

CTCGCCGC - aCacCACTgls IVltilXJCA- GAaTGCAA- IGClCcATcTACCC -GGtACCATC 



26 
27 
28 
29 
26-29 



Isolate
T2 

T4 

T9 

US10 

consensus 



367 ACIXjGACACCGTATGGCATGGGAcJVTGATGATGAAC^ 

II Ml M IMMM IMMMM 1 1 1 1 1 N 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 Iilllllllllll 

367 ACTGGACACCCn^TGGCATGGGAtAT 

II MMMM MIMIMIMIi 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 lllllllllll 

367 AGTGGJICACCGTATGGCATGGGACATGATGAXGAACT 

II II I MMIMIMI IMIM II Mill II MUM MMMM Mil Mill 

367 ACcGGgCACCGTATGGCATGGGACATCA'^^ 
ilCtGGaavCCGTATGGOat^GAc^ 



SEQ ID NO: 
26 

27 

28 

29 

26-29 



Isolate
T2 

T4 

T9 

US10 

consensus



428 TGGCGTAOTOGATGCGCGTTCCCGAGGTCMCaTAfiACATCaT^ 

IIIIIIIIMIIIIIIIIIIIIIIIIIIIIII IIIIIIII I lllllll llilllll 
428 iraCGTACGCGATGCGCGTTCCCGAGGTCATC^ 

IIIIIIIIIIIIIIIIIIIIIIMIIIIIIII IIIIIIII I Mill II IIIIIIII 
428 TGGCGTACGCGATGCGCGTITCCCGAGGTCATCATAGA^ 

lllllllll IMMM MMMMI IMMMM MM M II Mill II II Mill 

428 TGGCGTACXatGATCCGCGTTCCCGAGGTCAT 

TGGCGTACGcGATGCGCGTTCCCGAGGTCATCaTAGACATCaT - aGCGGgGC t CAdTGGGG 



SEQ ID NO:
26 

27 

28 

29 

26-29 



Isolate 
T2 

T4 

T9 

US10

consensus 



489 CGTCATGTTtGGCITGGCCTACTrcrCTA^ 

lllllll II MIMIMIMMMMMI IMMMM I MM MM III IIIIIIII 

489 CGTCATGTTCGGCTTGGCCTACITCTCTATGC 

Iilllllllllll I I MMMMI MM Mill I MM MM MM llllllllllll 

489 CGTOOlGTTCGGCcTAGCCTACrrOT 

Mil IIIIIIII MMIMIMM MM I MM IMMIMMMI IIIIIIIIIMI 

489 CGTCtTGTIXXratTAGCCTACTTC^ 

CGTCaTGTTcGGCtT - GCCTACTTCTCTATGCAGGGAGCGTGGGCGAA- GTCgTTGTCATC 



SEQ ID NO:
26 

27 

28 

29 

26-29 



Isolate 
T2 

T4 

T9 

US10 

consensus 



550 CTctTGCTGGCtGCTGGGGTGGACGCG 
II lllllll MMIIIIIMIMI 
550 CTtCTGCTGGCCGCTGGGGTGGACGOG 

II Mil lllllll lllllllll 
550 CTgtTGCTcaCOGCTGGcGTGGACGCG 

II I I I I lllllll lllllllll 
550 CrtCTGCTagCCGCTGGgGTGGACGCG 

CTt - TGCTggCcGCTGGgGTGGACGCG 



WO 95/01442 



17/47 



PCT/US94/07320 



FIGURE 1D



SEQ ID NO: Isolate
33 T8 



30 
32 
31 
30-33 



DK8
SW3 
DK11 
consensus 



1 GTGGAAGTtAGaAACAcCAGTT t tJ^CTACTACGCCACCAATGATTGCTCgAACAACAGCA 

ii ii ii 1 1 ii mi in:: 1 1 1 1 1 1 1 1 1 1 1 1 ! i ! 1 1 1 1 1 1 1 1 1 1 1 iii ii mi i 

1 (nT3GAACTCAGGAACATCAGTTCcAGCT 

IIIIIIIIIIIIIIIIIIIIII! Illlllll IIIIIIIIIIIIIIIIIIIIII IIIII 

1 GTGGAAGTCAGGAACATCAGTTCTAGCTACTAtGCC^ 

IIIIIIIIIIIIIIII IIIIIIIII IIIII IIIIIIIIIIIIIIIIIIIIII IIIII 

GTGGAAGTcAGgAACA- CAGTTctAGcTACTAcGCCACCAATGMTGCTCa^ 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8 

SW3 

DK11 

consensus 



62 TCACCTGGCAqCTCACCaACGCAGT 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 IIIIIIIII IIIMMMIIIMM IMIMIMMIIIMI I 

62 TCACCTGGCAACTO^CgACGC^GTTCTC 

lllllllllllllllll Mlllll 1 1 M 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 M lllllllllll 

62 TCACCTGGCAACTCACCAACGC^GTcCTCCACC^^ 

lllllllllllllllllllllllll IIIIIIIIIIIIIMIIIIIIII lllllllllll 

62 TCACCTGGCAACTCACCAACGCAGTtCTCCACCTTC 
TCACCTGGCAaCrCACCaACGCAGTtCTCCACCrrC^ 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8

SW3 

DK11 

consensus 



123 CAAIOTCACCtTGCGCTCCTGGATACAAGTaA^^ 

IIIIIIIII! I T 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 M ! 1 1 1 1 II M 1 1 1 f 1 1 III 

123 CAATGGCACCCTGCGCTGCTGGA!!AjCAAGTGACACC 

Mill Mil II II lllllllllll II 1 1 IIIII Mi I III 1 1 II Mill II 1 1 UN 1 1 

123 tAATGGCACCCTGCACTCXrTCXIATACAAGTGAQlC 

MMIMMIIIIIIIIMIIIMMMMMIIMMMMMIIMMMIIIIIIII 

123 cAATGGC^CCTGCACTGCTGGATACAAGTO 

CAATGGCACCCTGC- CTGCTGGATACTlAGTgACACCT^ 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate
T8

DK8

SW3

DK11

consensus 



184 GCACTcACTCAcAACCTGCGAACgCAtGTrCGACGTGAT^ 

Mill IIIII lllllllllll II llllllllllllllllllllllllllllllllll 

184 GCACTtACTCAtAACCTGCGAACACACXSTCGACGPK^ 

II II Mill IIMIIMI 1 1 M 1 1 II 1 1 1 1 ijj ' ill'JJ J ' ' ' ' ' 1 ' M ' 'JJ^,! 

184 GCqCTCACTCACAfiCCTCiCGAGCACJVCGTCGJOTVrcATOGTW 

II f IIIIIIIIIIIIIIIIIIIIII I Illlllll IIIIIIIIIIIIIIIIIIIIII 
184 GCaCTCACTCACAACCn^CGAGCAC^taTaGATATGAT^ 

GCaCTcACTCAcAACCTGCGA- CaCA- gTcGA- - TGATcGTAATGGCAGCTACGGTCTGCT 



SEQ ID NO:
33 



30 
32 
31 
30-33 



Isolate
T8 

DK8 

SW3 

DK11 

consensus 



245 CGGCCTTGTATGTGGGgGACGTqTGCGGGGCan^ 

1 1 1 1 1 1 1 1 1 1 1 f I IIIII 1 1 1 lllllllllllllllll I MINIUM lllllll 

245 CGGCCTTGTATGTGGGAGACGTaTGCGGGGCCGTGATGATCGTGTra 

Illlilllllllllllllll I lllllllillllllllllllllllilllll IIMIII 
245 CGGCCTTGTATGTGGGAGACaTGTGCGGGGCCGTGATGATCGTCT 

Illlilllllllllllllll IMMMIM Illlllll II I! IMIMMMMI Ml I 

245 CGGCCTTGTATGTGGGAGACgTGTGCGGGGCCGTGArGATC 
CGGCCTTCTATGTGGGaGACgTgTGCGGGGCOSTGATGATc^ 



WO 95/01442 



18/47 



PCT/US94/07320 



FIGURE 1D



SEQ ID NO:
33 



30 
32 
31 
30-33 



Isolate 
T8 

DK8 

SW3 

DK11
consensus 



306 ATCGCCaGAACGCCACAACTTcACCCAGGAGTGCAACTGTTC CATCTACCAAGGTCATATC 

1 1 i 1 1 1 IMMMMMM! 1 1 1 II 1 1 i 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 

306 ATCGCCtGAACGCCACAACTTTACCCAGGAGTGCAACTGTTCC^^ 

IIIIII I II IMIMI II Mill M I lllllllllllllllllilllllliill 1 1 1 1 

306 ATCGCCAGAACGCCACAACTTTACCCAAGAGTGCAACTGT^ 

lllllllllll II 1 1 MIMII lllllllllll IIIIMIMMII II 1 1 MM III 

306 ATCGCCAGAACaCCACcACTTTACCCAAGAGTGCAACTC 

ATCGCCaGAACgCCACaACTT tACCCA - GAGTGCAACTGTTCCArCTACCAAGGTCatATC 



SEQ ID NO: Isolate
33 T8 



30 
32 
31 
30-33 



DK8
SW3 
DK11 



367 ACCGGCCACCGCATGGCATGGGAjC^TGATGCTgAAC^ 

MMM III MMM II II MMI Mill Ml lllllllllllllllll llllllllll 

367 ACCGGCCACCGCATGGCATGGGACATGATGCTAAACTGGTCA 

lllllllllllllllll Mill IIIIII IIIIII llllllllllllll II III IMIMI 
367 ACCGGCC^CCGCATGGCgTGGGACATGATGCTAAACT 

I Mill IIMMIMM llllllllllllll lllllllllllllllll I I 1 ! I I f t I 1 
367 ACCGGCCACCGCATGGCaTGGGACATGATGCTtJUlC^ 

ACCGGCCACCGCATGGCaTGGGACATGATGCTaAACTG 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8

SW3

DK11 

consensus 



428 TCGCCTAcGCtGCTCGTGTgCCTGAaCTAGtCCTtg^ 

IMIMI II lllllill Mill MM III I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

428 TCGCCTATGCCGCTCGTGTTCCTGAGCTAGcCCTccAg^JITGTC 

I 1 I I I I 1 t 1 I I I I I 1 I I t 1 I I I I I 1 I 1 1 I III I I 1 I I I I I i I I I I I 1 I 1 I t I I f 1 1 
428 TtGCCTATGCCGCTCGTGTTCCn^CTAGTCCTTGA 

I Ml Mill III III MMM I II II II Ml II MM I llllllil II I Ml II II 

428 TcGCCTAJtreCGCcC G TOIT C CI^^ 

TcGCCTAtGCc*3CtCGTGTtCCTGAgCTAGt^ 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8

SW3

DK11

consensus 



489 



489 



489 



489 




3CTTGGCCTATTTCTCCATGC7teGGAGCGTGGGC 

IMIIMIMMIMIMM IIMMMMIMM Mill MMM 

3CTTGGCCTATITCTCCATGCAgGGAGOT 

I I I I II I II I I II II I I I I I M I I I II M I II I llllllllllllll t I I I I I I I I I t I 
CGTGGTGTTTGGCrraSCCXATTTCTCCATC 

1 1 II 1 1 1 II I II 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 ! I [[ I II II 1 1 II 1 1 

tGTGGTGTTTGGCTTGGCCrArrTCT 

rATTTCTCCATGCA - GGAGCGTGGGCCAA - GTCATtGCCATC 



SEQ ID NO:
33 

30 

32 

31 

30-33 



Isolate 
T8 

DK8 

SW3 

DK11 

consensus 



550 CTCCTcCTTGTCGCAGGAGTGGAcGCA 

Mill II I I II II I I I II II I I III 
550 CTCCTtCTTGTCGCAGGAGTGGATGCA 

DIM I III IMMIMI M M MM 

550 CTCCTgCTTGTCGCAGGAGTGGATGCA 

Mill Mill IMIIIIIIIIIIII 
550 CTCCTtCTTGTaGCAGGAGTGGATGCA 

CTCCTtCTTGTcGCAGGAGTGGAtGCA 



WO 95/01442 



19/47 



PCT/US94/07320 



FIGURE 1E



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



tTAGAGTGGCGGAATCTGTCcGGCCTCTAcGTCCTT^ 

I ill II II Mil litllli 1 1 1 1 1 1 1 1 Ml Mil II 1 1 1 MM 1 1 1 1 1 1 1 1 1 1 1 1 E I 

CTAGAGTKCGGAATGTGTCTGGC 

MMIMMIIMM IMMMMMMIMM 1 1 1 1 1 1 1 1 INI M I M 1 1 1 1 1 M 

CTAGAGTGGCXXaAAXACGTCTGGCCTC13lTC 

MMI MM MMIMIMIMMM MM Mil 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

CTAGAGTGGCGGAATACGTCTCGCCTCTATaTCCT 

I! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 M 1 1 IMMMI MMMMM I MMMI MM 

CTAGAGTCTCCGGAATACGTCTGGCCTCE^^ 



SEQ ID NO:
35 

36 

37 

39 

38 

35-39



Isolate
DK12

HK10 

S2 

S54 

S52 

consensus 



62 TcGTGTATGAGGCCGATGACGTCATTCITjCAGACAC^ 

I I I M I I I I I I I I I I I I I I 1 1 I I I I 1 I I I M ! 1 I 1 i I I I I I I I 1 M I I 1 I I I I II I II I I 
62 TlXjlXjrATGJVGGCCGATGAC^ 

llllllllllllllllllllll III III 1 1 II I III I INI MM 1 1 111 Mil Mil II 

62 TI\51tri ATGAGGCCGATCACGTtATT^ 

II Ml MMMM IMIMMI 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 III I II 1 1 M 1 1 1 1 1 1 1 

62 TTGTGrrATGAGGCCC^^ 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 It 1 1 ! M 1 1 ! 1 1 1 1 1 11 1 M M 1 1 1 1 1 II M II M I M 1 1 1 1 

62 TTGTGTATGAGGCCGATGACGTCATTCTGCACAC^ 
TtGTCTATGAGGCCGATGACGTcATTCTGCACA^ 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



123 CGGCAATACATCtACGTGCTGGACCTCaGTGACgCCTACAGTGGC^ 

IMIIIIIIIII llllllllllllll Mill 1 1 1 1 1 1 1 ! 1 1 1 1 f M 1 1 1 1 1 1 1 1 1 1 1 1 

123 CGCXlAATACATCCACGTGCTGGACCTCgGTGACACCTACA^ 

Ml MIMIMIIIIIMMMII I 1 1 1 1 1 1 1 1 M 1 1 1 1 M 1 1 1 1 1 1 M 1 1 llllll 

123 CGG t^TACATCCACCTTGCTGGACCCCAGTGACACCTACAGTGG CAGTCAGGTAtCSTCGGA 

Ml ! ! 1 1 1 1 1 1 M II 1 1 1 1 1 1 ! 1 1 M 1 1 1 1 1 1 1 1 1 M 1 IMMMMMIM llllll 

123 CGGCAATACATCCACGTGCTGGACCCCAGTGACACCXACGGTGGCAGTCRGGTACGTCGGA 

llllllllllllll 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 i i 1 1 1 1 1 1 1 1 1 1 1 11 1 11 1 1 i 

123 CGGCAATACATCCAtGTGCTXKI^CCCAGTGA^ 
CGGcAATACATCcAcGTGCTGGACCcCaGTGACa 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



184 GCAACCACCGCtTOGATAOGCAGTC^TGTGGACCIt3cTAG 

MIMIMIM 1 1 1 i 1 f 1 1 1 ■ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 1 li 1 1 II II I II llll II 

184 GCAACCACCGCcTCGATACGCAGTCATGrGGACCTC 

1 1 1 1 1 1 1 II 1 1 lllllllllllllllllllllll II llllllllllllll MMMI 

184 GCAACCACCGCTTCGATACGCAGTCAT 

i ii 1 1 1 1 1 n i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 ii li ii iiii mm mi ii in i 

184 GCAACCACCGCTTCGATACGCAGTCAltTTCGACCTATT^ 

I 1 1 1 f 1 1 f 1 f I I I I I I 1 I I I t I 1 I 1 I 1 1 I I I I I 1 1 I I 1 I I I 1 1 I 1 1 I I I I I 1 t t I I t 1 1 f I 
184 GCAACCACCGCTTCGATACGCAGTCATGTGG^ 

GCAACCACCGCtTCGATACGCAGTCATGTGGACCTatTaGTGGGCGCGGCCAC 



WO 95/01442 



20/47 



PCT/US94/07320 



FIGURE 1E



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus



245 CTGCGCTCTACGTGGGtGATgTGTGTGGGGCCGTCTTCCT 

MlllllliMMI!! !!! ! 1 1 1 1 i il 1 1 1 1 1 1 1 1 i ! I !!!! 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 

245 CTGCGCTCIACGTGGGcGATATCTGTGGGGCCGTCTTC 

I IIIIIMI Miilll [MINIM llillll ! 11 1 MM llliltl II MIMIIMI 
1 1| || 1 1| tl II III III Mill II IIIIM MII III M MIIII II II 1 II Mill II 

245 CTX^CTCTATGTGGGTGATATGTXntX^ 

IIIIIIMMIIIIIMIIIIIIIII llllllllllllllll IUIIIIII UMIIIIII 

245 CTGCGCnX ^ TGTGGGTGAIT^^ 

CTGCGCTCTA£*nTXX3tGAXaTGTGTGGGGCOT 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52

consensus 



306 imiriMiiiiiiiiiii iiiiiiiiiiiiiiiiiiiiiiiiinmiiiii m 

306 CAGACCaCGTCGCCATOUUlCGGTCCAGACCTGTAACTGCTCGCTC 

1 1 1 1 1 1 llllllllllllllllllllllllllllllilllllllllllllllMII III 

306 OWGACCTCGTCGCCAXCAAACGGTCCAGAC^ 

I IMIMI 1 1 lllill llillll I Mill Ml MIMII II MIIIMIIIMIlllllll 

306 CAGACCTCGTCGCCATCAAACGGTCCAGACCTGTAACnXSCT 

MINI M MMIIII MMMI MMIMM MIMM II MMI Ml III MIMI II 

306 CAGACCTCCTCGCCATCAAACGGTCCAGACCT 

CAGACCtCGTCGCCATCAAACgGTCCAGACCTGTi^^ 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



367 TCAGGACATCGAATGGCTTGGGATATGATGATC 

MMMMIMMMIIMMMMIMMMMMMMMMIM 1 1 1 1 1 1 1 1 1 1 1 1 1 

367 TCAGGACATCGAATGGCTTGGGATATGATGATGAAT^ 

1 1 1 1 1 1 1 1 1 1 1 IMIIIMI II MMIMM I MMIIII I M Ml 1 1 1 1 1 1 1 1 1 1 1 1 1 

367 TCAGGACAXCGcATGGCTTGGGATATGATGAT^^ 

I II III III II INI III III I III IMI 1 1 llllllllllllllll I! 1 1 II I II II 1 1 

367 TCAGGACATCGAATGGCTTGGGATATGATCATG^ 

" " " 1 1 1 " 1 1 1 " " 1 1 " 1 1 " 1 1 " 1 1 1 1 1 ' ! ' 1 ! J. 1 . 1 . 1 . 1 . 1 . 1 ! ' ! ' ' 1 ' ' ! ' 1 ! ' 1 

367 TCAGGACATCGAATGGCTTGGGATATGATGATGAATT^ 
TCAGGACATCGaATGGCTTGGGATATGATGATGA 



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 

consensus 



428 TaGCGCACGTCCTGOGtCTGCCCCAGACCITGTTCGAC^ 

I llllllllllllll IIIMI illlMIMIII MINI III! 1 1 1 M 1 1 II 1 1 1 1 1 

428 TGGCGCACGTCOTGCGqTTGCCCCAGACCTTGTTCGA^^ 

IIIIMMII lllll II 1 1 11 II II 1 1 IMIIMMIIIMIIIIIIIIIIIIIIMI 

428 TGGCGCACGTtCTGCGtTTGCCCCAGACCgTGTTCGACATAATAGCCG 

iiiiiiii i iiiii M 1 1 1 1 1 1 1 1 1 1 mi nun i i i 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 

428 TGGCGCACATCCTGCGATTGCCCCAGACCTTGTTTGACATACTTC 

1 [ 1 1 i 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 H 1 1 II !l I 

428 TGGCGCACATCCTGCGATTGCCCCAGACCTTGTTTGACAT^ 

TgGCGCACgTcCTGCG - tTGCCCCAGACC tTGTTcGACATAaTaGCcGGGGCCCATTGGGG 



WO 95/01442 



21/47 



PCT/US94/07320



FIGURE 1E



SEQ ID NO: Isolate
35 DK12 



36 
37 
39 
38 
35-39 



HK10 
S2 
S54
S52 
consensus 



489 CATCaTt^gGGCCTAKCTATT^ 

Mil 1 1 1 1 I IMMIMMMMIMII MMIIIMM MMIMMII Mill Ml 

489 CATCTTGGCaGGCCTAGCCTfiTXACTCCATGCAGGGCAACTGGGC 

Mlllllll I I F I I I I I ! M I I I I I I I I I I I I I I I I I I I II 11 I I I i I f I I I I I I I I 1 II 
489 CATCritiGCGGGCViyiGCCiyUTACTCCATC 

1 1 1 1 1 1 j 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii inn iiimiiiiimiMiiiimiii 

489 CATCTTXXSCGGGCCIArcCTATIATTCra 

1 1 1 1 n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 n 1 1 1 1 H 1 1 i i ii 

489 C3VTCTT(XX!GGGCCTAGCCTATTAT^ 

CATCCTGGCgGGCCTAGCCT^nTAxrTCcA^ 



SEQ ID NO: Isolate
35 DK12



36 
37 
39 
38 
35-39 



HK10 
S2 
S54 
S52 
consensus



550 ATXXnTATGTTTTCAGGaGTCGATGCC 

liiilli liillili ll lllllllll 

550 ATGGTITkTGTlTrCAGGGGTCGATGCC 

iiiiiii iiiiiiii iinini in 

550 ATCCTTATCTTTTCAGGGGTCGAcGCC 

iii n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 in 

550 JUX3ATTATGTTTTCAGGGGTCGATGCC 

1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 II f I 

550 ATGATTATGTTTTCAGGGGTCGA3X3CC 
ATGgTTATGTTTTCAGGgGTCGAtGCC 



WO 95/01442 



22/47 



PCT/US94/07320 



FIGURE 1F



SEQ ID NO: Isolate
43 Z7 

42 Z6 

42-43 consensus (Z6) 



1 GTcAACTATCaC^TGCCTCGGGCGTCTATC^^ 

II II Ml II 1 1 II II 1 1 II II I II IMI II I li Mil M II IIIMI 1 1 1 1 1 1 MM I 

1 GTtAACTAXCGCAATGCCIXXXSGCGTCTATC^
GTtLAACTATCgCAATCCCTCG^^ 



SEQ ID NO: Isolate
43 Z7 

42 Z6 

42-43 consensus (Z6) 



62 TAaTGTATGAGGCCGAACACCACATCCITlCACCTCC 

ii imiiimiiiiiiiii iii iiiiiiiiiiiiiiiii i iitti riififii 

62 TAGT GTA IXiAGGCCGAACAC^ 

TAgTGTATGAGGCCGAACACCAgATCtTACACCTCC^ 



SEQ ID NO:
43 

42 



Isolate 
Z7 

Z6 



123 gGGGAACCAGTCACGCTGCTGGGTGGCCCTTACTCCC^ cCTTATATCGGT 

Mill I MMIMIK IMMIMMIIMIIMI IIIMI MM I 1 1 1| 1 1 1 1 1 1 1 

123 tGGGAAtCAGTCACGCTGCTGGGTGGCCCTTACTC 



42-43 consensus (Z6) 



tGGGAAtCAGTCAaKrrcCTGGGTGGCCCTTACK: 



SEQ ID NO: Isolate
43 Z7 

42 Z6 
42-43 consensus (Z6) 

SEQ ID NO: Isolate 

43 Z7 

42 Z6
42-43 consensus (Z6) 



184 GCaCCGCITGAaTCCaTCCXXsAGACATGTGGACCTC 

II 1 1 1 1 1 1 1 1 III M M 1 1 II 1 1 1 M M M 1 1 1 1 11 M Mill Mill II IMI 

184 GCTCCGCTTIXaAcTCCCTCCGGAGACATGTGGACCT^ 

GCtCCGCTTGAcTCCcTCCGGAGACATGTGGACCTGATGCT 

245 CcGCtCTCTACaTTGGGGACCTGTCCGGTGGcGtATTtTrGGTTC 

i it mum mi ii iiiiiimii i in ilium niiiiii ii n 

245 CtGCCCTCTACgTTGGAGAtCTGTGCGGTGGTGcAT^ 
CtGCCCTCTACgTTGGaGAtCTGTGC 



SEQ ID NO:
43 



Isolate 
Z7 



42 Z6 
42-43 consensus (Z6) 



306 CCAGCCGCGACGCCACTIXXSACTACGCAGGACrc 

1 1 1 1 11 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MMI I Mill II Mill I 

306 CCAGCCGCGACGCCACTGGACTACGCAGG^ 

CCAGCCGCGACGCCACTGGACTACGCAGGACTCC^^ 



SEQ ID NO:



Isolate
Z7 



43 

42 Z6 
42-43 consensus (Z6) 



367 ACaGGCCACAGaATGGCATGGGACATGATGATGAACTGGAGTC 

II I M M II I I I 1 1 I I 1 I 1 I f I I ! I I I I 1 I I I I I I 1 I I I I 1 I I I I 1 ! I I 1 I t II I I 
367 ACgGGCCACAGgATGGCATGGGACATGATGATGAACTGGA 

ACgGGCCACAGgATGGCATGGGACATGATGATGAACTC CTG cTt C 



WO 95/01442 



23/47 



PCT/US94/07320 



FIGURE 1F



SEQ ID NO: Isolate
43 Z7 

42 Z6 

42-43 consensus (Z6) 



428 TCGCCCAGGTtJOXyuSGAXCCCTAG^ 

in linn i 1 1 nun urn nun ii ii ii iniii imiimniim 

428 TCGCCCAGGTcATCAGGATCCCTAGCACTCTGCT 
TCGCCCAGGTcMXSJlGGA^ 



SEQ ID NO: Isolate
43 Z7 



42 Z6 
42-43 consensus (Z6) 



489 teTCCrraTcGGGgTGGCalACTTCtGCaTCCAAGC^ 

i nn i i n mi linn i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 inn mm 

489 CgTCCTTGTTCKXjtTGGCCTACTTCAG 

cgTCCTTgTtGGGtTGGCgTACTTCaGtATGCAAGCIM 



SEQ ID NO: Isolate
43 Z7

42 Z6 

42-43 consensus (Z6) 



550 ^mTCCTCTaC6CTGGAGTTGAIX5CC 

nniiini n iniiiiiiiiiii 

550 CTTTTCCTCTTCGCTGGAGTTGATCCC 
CTTTTCCTCTtCGCTGGAGTTGATGCC 



WO 95/01442 



24/47 



PCT/US94/07320



FIGURE 1G



SEQ ID NO: Isolate
45 SA1 



47 
49 
46 
50 
48 
45-50 



SA5 
SA7 
SA4 
SA13 
SA6 

consensus 



1 GTtCCCTACCGgAATGCCTCTGGGGTTTAcCATGTCACC^ 

II llllllll II MNMI MM III I 1 1 1 1 1 M 1 1 1 1 II I 1 ( II 1 1 1 1 1 1 1 1 1 1 1 

1 GTCCCCTRCCGAAATCCCTCTGGGGTTTATCA3CT 

llllllllllllllllllll 1 1 1 II 1 1 1 1 M ! I M I f i 1 1 1 1 1 1 II 1 1 1 IIIIMIMI 

1 GTCCCCTACCGAAATGCCTCcGGG^^

II ! 1 1 1 1 1 1 1 1 1 1 Mill M M I M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 llllllllll 

1 (TTTCCCTACCGAAAcGCCTCTGGGGTTnVTC

MMIIIMIIIM IIMIMIM IMMM II I II II IMIMI MMMI llllllll 

1 GTTCCCTACOGAAATGCCTCTGGGGTTTOTCATGTC^

lllll Mill llllllllllllll llllllll MMIMIIIMMMM IMIMI 

1 GTTCCt^CCGgAATGCCTCTGGGGTgTATCA^
GTtCCCTJUTCGaAAtGCCTCtGGGGTtTAtOOOTcACCAAT^

II



II llll 



Illllllllllll llllllllllllllll I I 



mm! iiiinii iiiiini ii iiiiiiiiiiiiiin iiMiimnn i 



in 



in 



mill 



inn 



i ii illinium minim n iimiiiiiniim inn inn 

TcGTCTACGAGGCTGATGACCTOATCTTA^ 

i mn mill i iiiiimii i imiiiiiiiii iiiiiiiiiiiiii ii i 

TaGTCTAtGRGGCTrGATGACCTGATCcTACACGCTlCCTGGcTGCXTrc 
TaCTdTAcG3lTCCT^taaCCTGnrc-TgCA^CilCCTGGCTGCGTGCCCTGTC?rcaggcA 



AGaTAATGTCACTAGGTGCrTCG{ncaiAATCA^ 

ii in imiiimi i imm immiii i iii i jnii in 1 1 ii i inn 

AGqTAATGTCACnrAGGTGCrnXOTCCAAATCRCCCCCACATTCrrc^ 

I 1 1 1 1 1 1 f 1 1 ! r 1 1 1 ) 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 } I f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f I f 1 1 1 E 1 1 f J 

AaA33UVXGTCiU5IAGGTGCTGGGTCCAAAXCACCCCCACKrT^ 

i iiiiiiiimi 1 1 m 1 1 1 1 it m ii 1 1 1 ii 1 1 1 1 1 iiiiiiiiiiiiii imiii 

AGAIAATGTCAGTAaGTGCTXXSGTCOlAATCACCCCCACgTTGTC 

i inimmi imiiiiiiiii imimiii immmii mm 

GGgTAATGTCAGTASGTCCTGGGTCCAgATCACCCCCACACIXjTCAGCCCCGAGCCTOGGA 

II IMIIIIIIIII llllllll II J II 1 1 1 1 f 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

GGaTJUirGTCAGTAGaTGCTGGGTtCAfcATCACCCCCACACTaTCAGC^CCGAGCCinGGA 
agaTJUlIOTCAGXAggTGCTGGGTcCAaATCACCCCCACa-TgTCAGCCCCGAaccrCGGA

GCGGTCACGGCTCCTCTTCGGAGGGcCGTTGACTACTTAGCGGGA

M II I II !i ! ! II! II I M I III 1 1 II MINI 1 1 i I MM I II M MM! MINN 

GCGGTCACGGCTCCTCTTCGGAGGGt CGTTGACTACTTAGCGGGAGGGGCTGCCCTCPGCT 

MMMMMMIMMMMM II ! I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 M ! M 1 1 1 1 1 

GCGGTCACGGCTCCTCTTCGGAG^ 

MMMMMMII IIIMI I MMMI M II MM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

GCGGTCACGGCrcCTCTTCGGAGGGCCGTTGACTACTTAG 

1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 m 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 iiiiiiinii mi 

GCGGTCACGGCTCCIOTCGGAGGGCanTC^ 

1 1 1 ! M 1 1 1 1 1 1 1 1 1 M 1 1 1 1 M M I II I II I Mill Mill Mill Mill 1 1 1 1 

GCGGTCACGGCTCCTCITCGGAGGGCCGTra 
GCGGTCACGGCTCCTCrrCGGAGGGcCGTTGAcTACtTaGCGGG 



! 1 1 1 1 1 1 1 1 1 1 1 1 1 M IMMIMMMMMIMI M 1 1 1 1 1 N 1 1 1 1 1 ! 1 1 ! I II I 

CCGCACTATACGTCGGGGACGCGTGCGGGGCJU^^ 

I II I MM II IMMMIMI IMMMMIMM I lllllllllll MMMI III 



1 1 1 1 MM II MIIIMM II M MM M M M II III 

CCGCaCTATACGTCGGGGACGCGTGCGGGGCAGTGTTTT 

I I I I llllllllll Mill III Ml II II II II II I 
CCGCGTTATACGTCGGAGACGCGTGCGGGGl 

Mill MM II MIIIMM I llllllllll 

CCGCGTTATACGTCGGAGACGtGTGCGGGGCAt' 



MIIIMM MMMI III 

3TAGGCCAAATCTTCACCTA 

Mill 1 1 II 1 1 1 1 1 1 1 1 1 1 

rAGGtCAAATGTTCACCTA 

II llllllllllllll 
rAGGcCAAATGTTCACCTA 



CCGC - CTATACGTCGGgGACGcKjTCCGGGGCAg^XjITttTGG 
TAGGCCTCGCCAGCATACcACaGTGCAGGACTGCAACTGTTCC^ 

M 1 1 1 1 1 1 1 M ( 1 1 1 1 1 1 ii llllllllllllll JLJJLLJJL* 1 1 1 1 1 1 1 1 Miniiii 

TAGGCCTCGCCAGCATACTACGGTGCAGGACTGCAACro 

1 1 1 II 1 1 1 i I It I M IIIIIIIMIimilllllllMIIIIIIIIIII lllllllll 

TAGGCCTCGCCAGCACACTACGGTGCAGGACTGCAACTXTI^ 

IMIIIIMIIMIIIMIIIMIIII llllllll II II IMIMMIMMMIM 

TAGGCCTCGCCAGCAOVCTACGGTGCAaGAC^ 

ill Mlllll Ml I I Mill llllllll 11 1 1 llllllllllllll III 

TAGcCCTCGCCgGCATJ^TWtGTGCAGGACTGCAACTCtTCCATTTRCaGTGGCCRcATC 

Ml IMMMIMI I II llllllllllllll MMMIMIIIIMM III 

TAGgCCTCGCCaGCATgcTacgGTaCAGGACTGCAACTC 

TAGgCCTCGCCaGCAt ac^acgGTgCAgGACTGCAACTG tTC cATTTACAG tGGCCAtATC 
ACCGGCCACCGqATGGCtTGGGACATGATGAltyiATT^ 

M M 1 1 II II II 1 1 1 1 lllllllllllllllllllllllllllllllllllllll III 

ACCGGCC^^CGAATCGCATGGGACATGATGATGAATTGGTCACCrACGACM 



1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 M 1 1 M 1 1 1 1 1 1 1 1 



M 1 1 M M I M 1 1 1 1 1 M I M 1 1 1 1 



ACCGGCCACCGAATGGCATGGGACATGATGATGAATTGGTCACCTACGACAGCCriXiUlX^ 

lllllllllll 1 1 1 1 1 1 1 1 II 1 1 II I M ! M 1 1 1 1 II I M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 III 

ACCGGCCACCGGATGGCATGGGACATGATGATGAATTGGTCACCTACGACgGCCTTGcTGA 

lllllllllllllllllllllllllllllllllllllllllillill II II Ml III 
ACCGGC»CCGGATGGCATGGGACATGATGATGAATTGGTCACCTACaACJU3CtTTGGI^ 

II MMMMIMIIIIIIMIIIIMIIIIMIIMI Mill I Mill Mlllll 

ACtGGCC^CCGGAlW^TGGGACATGMX^^ 

AC(^^CC^CCGgATGGCaTGGGACATGATGATGAATTGGTCACCtaCgACaGCcTTGglX5A

SEQ ID NO: Isolate

45 SA1 428 TCX3CCC3tf5aTGCTACGGATcCCCC^gG^ 

1 1 1 1 II 1 1 lllllllli: lllll IIIIIIN MINIM MMMMMMIMM 

47 SA5 428 TGGCCCAGgTGCTACGGATTCCCCAaGTGGTCATtGACATCA 

MINIM 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 MMIM IIMMMIM I IMMII 

49 SA7 428 TGGCCCAGTTGCXACGGAlTCCCCAGGTCXsT^ 

I ! 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 

46 SA4 428 TGGCCOUaTlt j Cr A CGGJCTTCCCCAGCST ^ ^ 

M 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MIMMMMMIM lllllllli 

50 SA13 428 TGGCCCaGTTGrrACGGATTCCC^^ 

IMMII II lllllllllllllllllllllllllllllllllllllll lllllllli 

48 SA6 428 TCGCCCAaaTGCTACGGATTCCCCSU^
45-50 consensus TGGCCCAgtTGcTACGGATtCCCCAgGTGGTC^^ 



SEQ ID NO: Isolate

45 SA1 489 GGTCTTGTTtGCCGcCGCATACTTtGCGTCgGCcGC 

lllllllli I I I I lllllllli lllll II II IMMMIIMI M lll lllll 

47 SA5 489 GGTCTTGlTOGCCGtCXKaaaCTTCGOT 

MIIIHIIIIIII llllll IMMII Mill MIIIMI MMMMIIMMM Ml 

49 SA7 489 GGTCTTGTTCGCCGCCGCATAITTCGCGTCA^ 

lllllllli IMIIMIIIMMMMI MIMIIIMMI Ml IMMI I I MMM 

46 SA4 489 GGTCTTGTTCGCCGCCGCATATTTCGCGTC^ 

Ml Ml Ml lllllllllll I llllll I I I I I I I I I I I I 1 I HUM I llllll 

50 SA13 489 GGT C w l ' l\Jiri t:GCCGCCGC3lTACrara 

48 SA6 489 TCTCCTOCTC 

45-50 consensus GGTCTTGTTcGCCGccGCATAcTtcGCGTC-GC^

45 SA1



47 
49 
46 
50 
48 
45-50 



SA5
SA13
SA6 
consensus 



550 CTGTTcCTGTTTGCGGGGGTCGATGGC 

JIMIMMItl 1 1 1 1 1 1 1 1 1 1 1 M I 

550 CTGTTTCTGTTTGCGGGGGTCGATGGC 

Illlilllllll llUlllillll I 
550 TTGTTTCTGTTTGCGGGGGTCGATGCC 

55 JIIM JIMIM III II lllllllli 
550 cTGTTTCTGTTTGCGGGGGTCGATGCC 

550 ^Illllwdllli iiLLLili;!!!!!! 

- TGTTtCTGnTGCGGGGGTcGATGcC

(IV/2b)

(2c)


26-29 


(III/2a)

(V/3a)


9-25 


(II/1b)


1-8 


(I/1a)


40 


(4a) 


42-43 


(4c)


44 


(4d) 


41 


(4b) 


45-50 


(5a) 


51 


(6a) 



1 GTGGAAGTcAGgAACAtCAGTTctAGcTACTAcKSCCACCAATC 

1 GTGGAGGTCAAGGACACCGGCGACTC CTACATGCCGACCAACGATTGCTCCAACTCTAGTA 

1 GcccJU^GTGAagAACACCAgracCa^cTAcATGGTGACcAAc^ 

1 CTAGAGTGGCGGAATacGTCtGGCCTCTAtgTCCTtACCAACGACnOT 

1 tAtGAaGTGCgCAACGTgTCCGGGgt gT!AccAtGTCACgAAcGACTGcTCCAACTcaAGca 

1 tACCAAGTgCGOUUTTCcaCgGGgCTt 

1 GAGCACTACCGGAATGCTrcGGGCATCTAT^ 

1 GTtJUlCTATCgCAATGCCTOTGGCGTCTATC^^ 

1 TACAACIJm^CAACAGCTO^riCT 

1 GTCK^CT^CGGAftroCTTCGGGCGTCn^^ 

1 GTtCCc^l^CGaJUltGCCTCtGGGGTtTAn 

1 CTTACCTACGGCAACICCAGTXsGGCTATACCATCTCAC^^ 



1-51 



consensus 



TA 



AC AA GA TG C AA 



SEQ ID NO: Genotype

30-33 (IV/2b) 62 

34 (2c) 62 TCGTITGGCAGCTTGAAGGAGCAGTGCTTCATA

26-29 (III/2a) 62 TCACcTGGCAaCTccAgGCcGCGGTcCTCC^CTcCCCGGGTG 

35-39 (V/3a) 62 TtGTGTATCAGGCCGATCAO^ 

9-25 (II/1b) 62 TtGTtTIatGAg^GAgcgGACaTGATcaTGCAcACcCC^

1-8 (I/1a) 62 TtGTGTACGAGgCgGCcGATgCcATcCTgCAcaCtCCgG

40 (4a) 62 TAGTCTATGAAGCTGACCATCACATCCTAC^CTTC 

42-43 (4c) 62 XAgTGTATGJ^CCGAACACCAgATCtTACACCTCCCAGGGTC

44 (4d) 62 TAGTCTATGAAACCGATrACCACATCTTACACCTCCC® 

41 (4b) 62 TAGTGTACGAGACGGAGCACCACATCATG CA<nTGCCAGGGTGTGTCCCCTGTGTCCGGAC 
45-50 (5a) 62 TaGTc^AcGAGGCTGAta^CCTGATctTgCAcGaurCTGGt 

51 (6a) 62 TCGTGCTGGAGGCGGATGCTATGATCTTGCATTTGCCTGGATO 



1-51 



consensus 



T T CA 



CC GG TG T CC TG G 



SEQ ID NO: Genotype

30-33 (IV/2b) 123 cAATGGCACCcTGCgCTGCTGGATAC^AGTgACA^

34 (2c) 123 CGCCAACGTCTCTCGATGTTGGGTGCCGGTTC 

26-29 (III/2a) 123 gGGAAAtaCaTCtCGgTGCTGGATACCGGTctCaCCAAAcGTg^^ 

35-39 (V/3a) 123 CGGcAATA»TCcAcGTGCTGGACCcCaGTGACaCCTACaC^^ 

9-25 (II/1b) 123 gaacAAct cCTCccgcTGcTGGGTaGCGCTcaC tCCCACgCTcGCgGCcAGGAAcgccAgC

1-8 (I/1a) 123 GGgTaaCgcctCGAggTGTTGGGTGgCXSgTGaCCCCCACgGTgGCCACcAGGG^

40 (4a) 123 TGGGAACACATCGCGTIGCIGGACGCCGGTGACGCCTACAGTGGC^ 

42-43 (4c) 123 tGGGAAt CAGTCACGCTGCTGGGTGGCCCTTACTCCCACCGTGGCGG tG t CTTATATCGGT 

44 (4d) 123 AGGGAACAAGTCTACATOCTGGGTGTCTCTCACCCCCAC 

41 (4b) 123 GGAGAATACTTXrrCGCTGCTGGGTCCCCTTGA 

45-50 (5a) 123 agaTAATGTCAGTAggTGCTGGGTcCAaATCACCCCCAC^ 

51 (6a) 123 CGATGATCGGTCCACCTGTTGGCATGCTGTGACCCCCACCCTGG^ 



1-51 



consensus 



TG TGG 



T C CC A T C 
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S?0 NO: 


Genotype



30-33 
34 

26-29 

35-39

40

45-50
51 

1-51

184 GCaCTcJXTCAcAACCTGCGAaCaCAtgTcG

184 GCTCTCACTAAGGGCCTGCGAGCACACATCGATATCA^ 

184 GCcCTcACGCAGGGCTTGCGGACgCACATcGACATGGTtGTGATGTC 

184 GCAACGACCGCtTCGATACGCAGTCATGTGGACCT 

184 gTCcCcACtAcGaCaATACGACgcCAcGTCGAtTTCCTC 

184 CTCCCcgCAaCGCAgCTtCGACGTcACATCGA^ 

184 GCTCCGCITGAb"in: G TTCCGGCGACATOlXK3^^ 

184 GCtCCGCTTCAcT CCcT CCGGAGACATGTGGRCCTtSa^ 

184 GCICCGCTTGAGTCTTTGAGAanxaCG^ 

184 GCACCGTOUSAGTCCATGOGCAGGOOT 

184 GCGGTCACGGCTCCTCTTCGGAGGGcC<tt 

184 JWOGCCCGCAACGCMATTCOGCAGGCJaCTGCS^^ 



T G 



T GA 



T G 



GC 



T TG T 



30-33 (IV/2b) 245 CGGCCTlXrniTGTGGGaGACgTgTGCGGG

34 (2c) 245 CTGCCCTTTATGTGGGGGAOSIXjTGTGGCGCGC^

26-29 (III/2a) 245 CcGCtCTtTACGTGGGGGAcCTCTGCGGcGGGgTgAT^

35-39 (V/3a) 245 CTGCGCTCTAcGTGGGtGATaTGTGlXjGGGCCGTCxTtCT

9-25 (II/1b) 245 CCGctJVTGTAcGTGGGgGAtCTcTGCGGaTC tGTt tTCCTcgTcTCcCAG cTGTP CACc tT

1-8 (I/1a) 245 CGGCCCTCTAcGTGGGGGACtTGTGOGGGTCTGTCTTtCTtff

40 (4a) 245 CTGCCCTCTTOXnTCGGGACCTC^

42-43 (4c) 245 CtGCCCTCTACgTTGGaGAtCTGTGCGGTGGtGcAl^^^

44 (4d) 245 CaaCCCTCTACATCGGAGACGTGTGTGGGGGlXJlVI'^

41 (4b) 245 CCGCCTTCTACATTGGAGATCTGTGTGGAGGCCT

45-50 (5a) 245 CCGCgcTATACGTCGGgGACGcGTGCGGGGCAgTGTTttTO

51 (6a) 245 CATCCClXnrACATCGGGGACCTGTGTGGCTCTCT

1-51 consensus C T TA T GG GA TG GG TT CA T

30-33 (IV/2b) 306 ATCGCCaGAACgCCACaACTTtACCCAaGAGTGCAACTGTTCCATC^

34 (2c) 306 GTCGCCACAACACCATACGTTTGTCCAGGAATGC

26-29 (III/2a) 306 CTCTCCGCaaCacCACTg^nTTGTGCAaGAaTGCAAtTG^

35-39 (V/3a) 306 CAGACCtCGTCGCCATCAAACgGTCCAGACCTGTAACTGCT

9-25 (II/1b) 306 cTCgCCtCGcCggcAtgaGACagtaCAGgAcTGcAAcfrGcTCaaT^^

1-8 (I/1a) 306 cTCtCCCMgCgCCaCTGGACaACGCAaGaCTO^

40 (4a) 306 TCGGCCGCGTCGCCACTGGACCACGCAGGAG

42-43 (4c) 306 CCAGCCGCGACGCCACTGGACTACGCAGGACTGCAATTGTTC tAXCTAcGCaGGGCAt aTc

44 (4d) 306 CCAACCTCGCCGCCACTGGACCACCCAAGACTGCAATTO^

41 (4b) 306 CCGACCGCGCCGGCACTGGACCACCCAGGATTGCAACTCCTC^

45-50 (5a) 306 TAGgCCTCGCCaGCAtactacgGTgCAgGACTGCAAcTGt

51 (6a) 306 TCAGCCCCGCCGTCATTGGACltnCCAAGACTGC^^

1-51 consensus CC C CA TG AA TG TC T TA GG T

30-33 (IV/2b) 367 ACOGGCCACCGCAlGGCaTGGGACATGATGCTaAA

34 (2c) 367 ACGGGACACCGCATGGCTTGGGATAIX^TGATGAACTCGTC

26-29 (III/2a) 367 ACtGGaCACCGTATGGC^TGGGAcATGATGATGAACTGGTCGCCCACg

35-39 (V/3a) 367 TCAGGACATCGaATGGCTTGGGATATGATGATGAATTGGTCCC

9-25 (II/1b) 367 tCAGGTCAcTOcATGGCtTGGGAtATGATGATGAAcTGGTC

1-8 (I/1a) 367 ACGGGtCAcCGcATGGCaTGGGAT^TGATGATGAACTX^^

40 (4a) 367 ACCGGCCyiCAGGATGGCGTGGGACATGATGATCAAC^

42-43 (4c) 367 ACgGGCCACAGgATGGClATGGGACATGATGATGAACTGGACT

44 (4d) 367 ACAGGACACAGAATGGCT1X3GGACATGATGA^

41 (4b) 367 TCGGGCCACAGGATGGCCTGGGACATX^TGATGAACT^

45-50 (5a) 367 ACcGGCCACCGgATGGCaTGGGACATGATGATGAAlTGGTCACC

51 (6a) 367 ACCGGCCACAGGATCGCT1X5GGACAIGATC

1-51 consensus C GG CA G ATGGC TGGGA ATGATG T AA TGG CC C T T
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STTGCCGGAGGCCACXGGGG 



T C 



G T CC 



T T 



GG G CA TGGGG 



T T 



GC T T 



TGG AA GT 



T T T 



C GG GT GA G 
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1 YQVRNSTGLYHVTNDCPNSSrVYEtADAILHaPGCVPCVREGNtS 

I INI I !! i I II 1 1 Ml INI 1 1! HUM lllllllllli 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 

1 YQVRNS TGLYKVTND CP NS S IVYEAAD AILHTPGCVP CVREGN vS RCWVAMTPTVAXRDGK 



1 YQVRNSTGLYHVTNDCPNS S rVYEAADAILHTPGCVPCVREGNaSRCWVAMl'PT VATRDGK 
1 HQVRNSTCTYHVTNDCPNSSIVYEAADAILHTPGnTPCV^ 



1 HQVRNSTGLYHVTHDCPNS S IVYEAADAILHaPGCVPCVREGNAS RCWVAVTPTVAXRDGK 
1 YQVX^SSGLYHVm>CPNSSr\nfEAADAXIHSPGCVFCVREGK^ 



1 YQVireSSGLYHVTNDCPNSSIVYETAnAIIJISPGC^ 

1 YQVRNS tGLYHVTNDCPNSS IVYBTAD t ILHS PGCVT>CVREgiiA£rCWpVAPTVATRDGK 
yQVRKStGLYHVTKDCPNSSrvyEaADalLH- PGCVPCVREgiiasrCWVavtPTVATRDGK 



62 IJatQU*RyIDIJArt3SATI<CSALY^^ 
62 LPTaQLRRHIDIJjVGSATLCSALYVGDLCGSVFLVG^ 



III 

62 LPTOQLRRHIDIjLVGSATLCSALYVGDLCGSVFLVGQLFTO^ 
62 'iJ4TgiiRRHID^ 



62 LPTTQLRRHIDIXVGSATLCSALYVGDLCX5SVFL 
62 ' LPM^lJJ^ 

62 I^ATQLRRHIDIiVGSATLCSALYV^ 
62 LPJ^QI^ 

LP -tQUimilDLLVGSATLCSALYVGDI^SVFLVggLFTf 

123 TGHRMAWDMMMNWS PTTALWAQLLRI PQAI LDMIAGAHWGVLAG IAYFSMVGNWAKVLW 
123 TGHIttlAWDMMMNWSPITALVVAQIJJ^ 

123 TGHRMAWDMMMNWS PTaALWAQLLRI PQAI LDMIAGAHWGVLAG IAYFSMVGNWAKVLW 



123 TGHRMAWDMMMNWSPTTALWAQLLRI PQAI LDMIAGAHWGVLAG IAYFSMVGNWAKVLW 

II II 1 1 II I Ml Mill III IIIMMIMMI II IMIIMI IMMIIM Mill II 

123 TGHRMAWDMMMNWS PTTALVMAQLLRI PQAI LDMIAGAHWGVLAG IAYFSMVGNWAKWW 



123 TGHRMAWDMMMNWS PTaALVMAQLLRI PQAI LDMIAGAHWGVLAGIAYFSMVGNWAKWW 

IIIIIIIIIMIIMI III IIIIIMII MMIIII M MMII MM II I II II .1 

123 TGHRMAWDMMMNWS PTTALVvAQLLRI PQAVLDMIAGAKWGVLAGIAYFSMVGNWAKVLiV 
123 TGsJJlAHDMM^^ 

TGHRMAWDMMMNWS PT LALVvAQLLRi PQAi LDMIAGAHWGVLAG IAYFSMvGNWAKVl vV 
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FIGURE 2A 



SEQ ID NO: Isolate

56 S14 184 LLLFAGVDA
|||||||||
52 DK7 184 LLLFAGVDA
|||||||||
59 US11 184 LLLFAGVDA
|||||||||
55 DR4 184 LLLFAGVDA
|||||||||
54 DR1 184 LLLFAGVDA
|||| ||||
53 DK9 184 LLLFtGVDA
|||| ||||
58 SW1 184 LLLFsGVDA
|||| ||||
57 S18 184 LLLFaGVDA

52-59 consensus LLLFaGVDA
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FIGURE 2B 



SEQ ID NO: Isolate

75 T10 1
62 DK1 1
64 HK4 1
76 US6 1
68 IND8 1
67 IND5 1
73 SW2 1
63 HK3 1
66 HK8 1
61 D3 1
74 T3 1
65 HK5 1
71 S45 1
72 SA10 1
69 P10 1
60 D1 1
70 S9 1
60-76 consensus





1 7EVRKVSGmYHV17CCSNSSrV£EAaDlXM!nTGCVFCVREgNsSRCWVALTFTLAAHIItS 



II 



II I I I I 
lEVhUVSG: 

II Mil 

rcVKlTVSGmYHVTOTCSNSSrVYEAADMIM^ 

I M 1 1 1 1 IIIMMMMI IIIIIMmilll Mill I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

EVRNVSGVYHVTOTCSNSSrVYEAADMIMHTPG^ 
1 YEVRNVSGVYHVTCTOCSNSSr^ 
BVRNVSG 

lllllll 



llllllllllllll 

VYETADMZMHTFGC 
1 yEWRHrVSGVYqVTNDCSNSSIVYETM^^ 



lllllll llllll IMIIIIIIIIIIII MINIMI llllllll 

nr\TTNDCSNlSIVYKTtJDM3MHTPGCVPCVRENNSSRCW\^^ 

MINIMI Mill I I 



ii milium i m 

YBVRKVSGaYHVTNDCSNSS IWEaADvIMOTPGCVPCVqBgRSSqCWVaurPTLAAHRat 
yEVrmrS(^rYhVrNDCSNsSiVyBaaDmI»irTP<K>rP^ 
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FIGURE 2B 

SEQ ID NO: Isolate

75 T10 62 vPTTTIRRHVDIJiVGAAAFCSAMYVGDLCGSVFLVSQLFCT 

MMII M!lli!!lill!Miimili:illllt!lili lllill IMIIIIIiil 

62 DK1 62 IPTITIRRHVDIJ.VGJU^CSAMyVGDLCGSVFLVSQr^ 



64 HK4 62 I PUTriHkKHV DIiLVGAAAFCSflMyVGDLCGSVFLVSQLF^ 

INI Mill MINI I 1 1 1 1 ! 1 1 1 N N 1 1 1 1 MIMIMI 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

76 US6 62 VPTTTIRRHVDLLVGAAt^CSAMYVGDLCGSVFIiiSQLFTF^

1 1 IMM III IIIMI I IMIIIIIMIIMM 1 1 1 1 1 1 1 1 1 1 1 1! 1 1 1 1 1 1 ! 1 1 f 1 1 

68 IND8 62 VPTTTIRRHVDLLVGAAAFCSAMYVGDIOT

67 IND5 62 VslTTTRhHVDLLVGAAAF 

73 SW2 62 VPTTTIRRBn/DLL^^ IYPGHV 

63 HK3 62 VPTTTIOTHVDLLVGAAAFCSJVMYVOTLTO 1 YPGHV 

66 HK8 62 VPTTTIRRHVDLLVGAAAFCSAMYVGDI^^ 



61 D3 62 VPTTTIRIHIVDIiLVGAAAFCSfl!^^ 

74 T3 62 VOTJOTRRHTO^ 

65 HK5 62 TOTTalRRHTOIiL^^ IYPGEV 

71 S45 62 TOTTTIRRHVDLLVGAAAFCSJVMyV^ 

IMM 1 1 MMMMMIMMMIIMIMMMI MMMIM 1 1 1 1 1 1 1 1 1 1 1 1 1 I 

72 SA10 62 VPTTTIRRHVDIiLVGAAAFCSAM3fVGDLCGSVFLV^ 

69 P10 62 VPTTAIRRHTOLLVGAAAFCSMiy^ 

MMII MIMMMMIIIMMMMMM I IMM Mill III llllllllll 

60 D1 62 VPTTAIRItflVDI^VGAAAFCSAMra

(III IMIIIIIIIM I I I I I I I M I I I I I I I I I 11 I ! Illllllll llllllllll 

70 S9 62 VPTTtlRRHVDLLVGAAvFCSAMYVGDLCGSVFLISQLFTiSP 

60-76 consensus vpTttlRrHVDLLVGAAaFCSaMYVGDI^ 
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FIGURE 2B 

SEQ ID NO: Isolate

75 T10 123 SGHRMAWDMMMNWSPTTJtf-WSQLI^PQAVmDMVtGAI^^ 

!! 1 1 1 M II I! MMIMM HUM!!!! III I ! V S 1 1 1 1 ! I f I E 1 1 1 1 1 1 1 1 i 1 1 1 

62 DK1 123 SGHRMAWDMMMNWS PTTALV1 SQLLRI PQAVvDMVAGAHWGVIJtfSIJlYYSMAGNWAKVLI V 

II I II 1 1 II II Ml II III Mill MM I II III I MIMM Mill IMIIMM 

64 HK4 123 SGHRMAWDMMMNWS PTAALWSQLLRlPQAVMDMVAGAHWGVIiAGIiAY^ 

76 US6 123 SGHRMAWDMMMNWSPTAALVVSQIXRIPQAVMDMUJ^GAHWG 

68 IND8 123 SGHRMAWDMMMNWSPTAALVVSQLLItf PQAVVDMVAGAHWG
67 IND5 123 SGHRMAVTOMMMNWSPTAALVVSQLI^ 

73 SW2 123 ScUcRMAWDMMi^^ 

63 HK3 123 SGHRMAWDMMMNWSPTAALWSQLLRIPQAVV^^ 
66 HK8 123 SGHRMAWDMMMNWS PTtALVVSQLI*MPQ^

61 D3 123 TGHRMAWDMMMNWSPTaALVVSQLlJlIPQAVVDMV^^ 

74 T3 123 TGHRMAWDMMMNWSPTIALWSQLLRIPQAVVD^ 

65 HK5 123 TGHRMAWDMMMNWSPTTALVVSQLLRIPQAVVDMVAC^^ 

71 S45 123 TCHRmIwD^^ 

72 SA10 123 TGHRMAWDMMMNWS PTtALWSQLLRI PQATTOMVAGAHWGVIAGIJireSMVGNWAKVU V 

I M II II M I II 1 1 1 I It M 1 1 II 1 1 M 1 I I M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 

69 P10 123 SGHRMAWDMMMNWS PTaALVVSQLLRI PQAI 1DVVAGA

60 D1 123 TGHRMAWDMMMNWSPTTALWSQLLJtlPQAVMDMVAGA^^

70 S9 123 TGHRMAWDMMMNWSPTTALWSQLLRIPQAVMDMVA^^

60-76 consensus SGHRMAWDMMMNWS PTaALVvSQLLRi PQAvvDmVaGAHWG vIJ^GIATYSMyGNWAKVLIV 



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FIGURE 2B 



SEQ ID NO: Isolate

75 T10 184 mLLFAGVDG
||||||||
62 DK1 184 lLLFAGVDG
||||||||
64 HK4 184 mLLFAGVDG
||||||||
76 US6 184 lLLFAGVDG
||||||||
68 IND8 184 MLLFAGVDG
|||||||||
67 IND5 184 MLLFAGVDG
|||||||||
73 SW2 184 MLLFAGVDG
|||||||||
63 HK3 184 MLLFAGVDG
|||||||||
66 HK8 184 MLLFAGVDG
|||||||||
61 D3 184 MLLFAGVDG
||||||||
74 T3 184 lLLFAGVDG
||||||||
65 HK5 184 MLLFAGVDG
|||||||||
71 S45 184 MLLFAGVDG
|||||||||
72 SA10 184 MLLFAGVDG
|||||||||
69 P10 184 MLLFAGVDG
|||||||||
60 D1 184 MLLFAGVDG
|||||||||
70 S9 184 MLLFAGVDG

60-76 consensus mLLFAGVDG



WO 95/01442 



36/47 



PCT/US94/07320 



FIGURE 2C 



SEQ ID NO: Isolate

77 T2 1 AQVrNTsrgYMVTNDCSNeSITWQLQAAVLHVPGCiPCErlGNTSRCWIFVtPNYaVROPG
iii ii iniiiiii iiiiiiiiiniiiii iii niiiiiiii iiiiiin:
78 T4 1 AQVIOn'tllSYMVTICCSroSZTWQLQAAVIiHVFGCVPCEktGOTSIOnPVSPNVAVnOPQ
I (III I Nil I II III Mill llll lllll I II II I II I II II II llllll II
79 T9 1 AeVKOTSTSTMVTNDCSinDS rrWQLQAAVLHVPGCVPCT
I II II II IMIIIIIIMIMM IIIIIIIIIIIII III III II II IMIIIMM
80 US10 1 vqVKNTSTSYMVTNDCSITOSITWQLeAAVLHVPGCTO

77-80 consensus aq^VTclTrstsYMVTroCSNdSITWQIjqAAV^ - vGNtSRCWIPVsPNVAV- - PG



SEQ ID NO: Isolate

77 T2 62 lUjTQGIilHHIDMVVMSATIiCSALYVOTL
II Ml II INI llll IMMIMI IIIIIMIIIIIIIM IN lllll llllllllll
78 T4 62 J^TQGIjRTHIDM\AfllSATLCSALYVGDLCGGVMI^^ PQHHWFVQdCNCS I YPGTT
llllll MNMIIMMMIIIIIIIIIMIIIIIIIII IIIMMII llllllllil
79 T9 62 ALTQGLinffiDMVVMSATLCSALYVG
IIIIIMI IIIIIMMIIIIIIIMI III lllllill II II 1 f 1 1 1 1 1 1 1 1 1 1 1 1
80 US10 62 ALTQGUmaDMVVMSATKSALYVGD^

77-80 consensus iU-T^LRTHIDMVVMSATLCSALyVGDlC^



SEQ ID NO: Isolate

77 T2 123 TGHRMAWDMMMNWSPTAimLAY^
II 1 1 1 M II IIIMIIIMIMMM I Mil II I MMIIMM 1 1 II IMMII II
78 T4 123 TGHRMAWDMMMIWSPTAIMIIAyAMKVPEVI ID I vSGJVHWGVMFGIAYPSMQGAWAKVVVI
illlllllllllllll llllllllllllll II Ml II MIMIIIIIIII llllll ||
79 T9 123 TGHRMAWDMMMOTSPTtIMIIJV5f3^MRVPEVI IDI ISGAHWGVMFGLAYFSMQGAWAKVVVI
Illlllllllllllll I llll I IMMMIII MMIM MINI II li MIIIMI
80 US10 123 TGHRMAWDMMMNWSPTaTl ILAYvMRVPEVI ID 1 1 SGAHWGVl FGIAYFSMQGAWAKVWI

77-80 consensus TGHRMAWDMMMNWSPTaTMIAYaMRVPEVI iD I i sGAHWGVtoFGLAYFSMQGAWAKVvVT



SEQ ID NO: Isolate

77 T2 184 LLLAAGVDA
|||||||||
78 T4 184 LLLAAGVDA
||| |||||
79 T9 184 LLLtAGVDA
||| |||||
80 US10 184 LLLaAGVDA

77-80 consensus LLLaAGVDA
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FIGURE 2D 



SEQ ID NO: Isolate

82 DK11 VEVRNt S S S YYATNDCSNnS ITWQLTNAVLHLPGCVPCENDNGTLHCWIQVTPKVAV^
1 1 1 1 1 IMMMMMi MMIMM Ml II I MM I ! M MIMM M M MM MM
83 SW3 VEVraiSSSYTATrroCSNsSITWQL^
Mill I 1 1 III lllll Mlllllli III llllll Ml Mill III MM MM MM
84 T8 VEVIWtSf SYYATTOCSNNSriVQI*TriAV
I I I I 1 I I II
lllll I 1 1 M II 1 1 1 1 1 1 1 1 1 1 ) I I [ 1 1 1 1 1 1 1 i II I E i 1 1 1 1 1 1 1 1 II 1 1 1 1 M 1 1 1
81 DK8 VEVraiSsSY^TNDCSRNSITWQLTd7lVLHIJ>GCVPCEro

81-84 consensus VEVRN - S s SYYAITOCSNhSITWQLTnAVLHLPGCVPCENDNGTL - CWIQVTPNVAVKHRG



SEQ ID NO: Isolate

82 DK11 62 ALTHNLJtAHiDMTVMAATVCSALYVGDvCGA PEnHhFTQECNCS I YQGhl
lllllllll lllllllllllllllll t 1 I I I I I I I f I I III I I I f I I I I 1 I I 1 I I
83 SW3 62 ALTHNLRRHVDMIVMAATVCSALYVGDmCGAVMrVSOAFIIS PERHNFTQECNCSIYQGr I
lllllll III !! 1 1 1 1 1 1 1 1 1 1 1 1 1 llllll 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 i I ! 1 1 1 ! 1 1 1 I
84 T8 62 ALTHNLRTHVDVTVMAATVCSALYVGDVCGA^
1 1 1[ 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
81 DK8 62 ALTHNLRTHVDVTVMAAWCSALYVG^

81-84 consensus ALTHNLR- HvD - IVMAATVCSALYVGDvCGAVMIvSQAf IiSPErHnFTQBCNCSIYQGhl



SEQ ID NO: Isolate

82 DK11 123 TGHRMAWDMMLNWS PTLTMI IAYJUU^VPELVLEVVFGGHWGVVFGIJiyFSMQGJVWiUCVIM
11 1 mi 11 11 iiiii 11 11 iitiiiiiiiiiiiiii 111 mimm 1 imimii
83 SW3 123 TGHRMAWDMMLNWS PTLTMLAYRARV^
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 i
84 T8 123 TGHRMAWDMMLNWS PTLTMIIJVEAAKVPELVLEVVFGGHWGVVFGIiATO
1 1 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 1 1 f 1 1 1 1 1 1 1 1 1 1 I 1 ! f f f f 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
81 DK8 123 TGHRMAWDMMLNWSPTLIMIIAYAARVPELaI*qVVFGG^^

81-84 consensus TGHRMAWDMMLNWS PTLTMILAYAARVPELvLeVVFGGHWGVVFGLAYF^



SEQ ID NO: Isolate

82 DK11 184 LLLVAGVDA
|||||||||
83 SW3 184 LLLVAGVDA
|||||||||
84 T8 184 LLLVAGVDA
|||||||||
81 DK8 184 LLLVAGVDA

81-84 consensus LLLVAGVDA
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FIGURE 2E 



SEQ ID NO: Isolate

86 DK12 LEWRITOGLYVLTiroCsNSSIVYEADDVIIinTC
1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 ! 1 1 : i : ! 1 1 1 1 : 1 1 1 i 1 1 1 : i ! : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
87 HK10 I^WRNVSGLYVLTNDCpNSSIVYEADDVIIOT
Mill llllllllll Ml MM IMIIIMIM! IMI MM II III MIMMMM
88 S2 LEWRNTSGLYVLTND CSNSS I VYEADDVT LHTPGCVPCVQDGNTSTCWTP VTPTVAVRYVG
1 1 1 1 1 1 1 f 1 i 1 1 M f ! 1 1 M [ 1 1 1 1 [ 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 M i 1 1 1 1 1 1 1 1 1 1 1 1
90 S54 LEWRNTSGLYiXTND CSNSS TVYEJ^DVILHTPGCVPCVODGOT
llllllllll IIIIIIIIIIMII! Illlllllllll Nil III IIIMIIIIIIIIII
89 S52 LEWRNTSGLYVLTND CSNSS IVYEADDVI LHTPGCVPCVQIKSNTSmCVrrPVTP'r VAVRYTO

86-90 consensus LEWRN tSGLYvLTNDCsNS S IVYEMDVIIjnTGCVPCVQDGOTStCWTfcVTPTV



SEQ ID NO: Isolate

86 DK12 62 ATTASIRSHVDI^VGAATMCSALYVGDvCGAVFLVGQAFTO
MIMIIIMIIIIMIIIMIMIM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
87 HK10 62 ATTAS IRSHVDLLVGAATMCSALYVGDMCGAW
IMM M MIMIIM II 1 1 MM MM II Ml Ml MIMI II IMMI M IMIIIM I
88 S2 62 ATTAS IRSHVDLLVGAA1WCSALYVGDMCGAVFLVGQAFT
MMMMIMMIMM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
90 S54 62 ATTAS IRSHVDI^VGAATLCSALYVGDMCGAVFL^
I [ 1 1 f 1 1 1 1 1 1 ! 1 1 i 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 [ 1 1 11 1 1 1 1 1 1 1 1 1 M 1 ! tl 1 1 1 1 1 1 ! I
89 S52 62 ATTASIRSHVDLLVGAATLCSALYVGDMCGAVFLV^

86-90 consensus ATTAS IRSHVDLLVGAATtaCSALYVGDmCGAVFLV^



SEQ ID NO: Isolate

86 DK12 123 SGHRMAWDMMMNWS PAVGMWAHVLRLPQTLFD I IAGAHWGImAGLAYYSMQGNWAKVAI I
i 1 1 f 1 1 1 1 1 1 1 1 1 1 1 f 1 1 1 ! 1 1 1 f 1 1 1 1 1 f f 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1
87 HK10 123 SGHRMAWDMMMNWS PAVGMWAHVLRLPQTLFD I IAGAHWGIIAGLACTSMQGNWAKVAI I
Ml M IMMI M II II II I MM IMM I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 ! i 1 1 1 1
88 S2 123 SGHRMAWDMMMNWSPAVGMWJtfTVTJlLPQTvFDIIAG
IIIIIIIIIIMII MM MIM llllll III I I I I I I I I I I I I I I I I I I I I I I I I I I
90 S54 123 SGHRMAWDMMMNWSPAVGMVVAKIIJUiPQ^FDIIAGAHWGIIJ^
II MIMMMMM MIMIMMM IMMI IMMI MMMMI M M M MM M
89 S52 123 SGHRMAWDMMMNWSPAVGM\A/3^ILRLPQTLFDIIiAGAHV^

86-90 consensus SGHRMAWDMMMNWS PAVGMWAHvLRLPQTl FDI iAGAHWGI lAGIiAYYSMQGNWAKVAI i



SEQ ID NO: Isolate

86 DK12 184 MVMFSGVDA
|||||||||
87 HK10 184 MVMFSGVDA
|||||||||
88 S2 184 MVMFSGVDA
| |||||||
90 S54 184 MIMFSGVDA
|||||||||
89 S52 184 MIMFSGVDA

86-90 consensus MvMFSGVDA
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FIGURE 2F 



SEQ ID NO: Isolate

93 Z7 1 VNYnNASGVYHiTNDCPNSS ImYEAEHHILHLPGCVPCVRe^
in Minn minimi inn mini mm iiiiiiiiiiiiiii in
94 Z6 1 VNYRNASGVYHVraDCPNSS IVyEAEHqILHZ<PGClPCVRvGRQSRCWV^TPTVrAvsTX^

93-94 consensus (Z€) VNYrlBVSGVYHvTNDCPNSSIvYKAEHqIIflI«^



SEQ ID NO: Isolate

93 Z7
94 Z6
93-94 consensus (26)



62 APLESilKHVDLMVGAAXrcSAItTO 

III I llllllllllllllllil llllll IMMIMIMMMMM IMMMII 

APIxiSlRimVDLMVGAATVCSALYv^ 



SEQ ID NO: Isolate

93 Z7 123 TGHRMAWDMMMNWSPTTTLvIAQVMRI^
I ! 1 1 1 1 1 1 1 1 1 M 1 1 1 ! 1 1 I II It I ! i II 1 1 1 1 ! Mill I I III 1 1 1 1 1 1 1 M 1 1
94 Z6 123 TGHRMAWDMMMNWSPTTTL1IAQVMRIPSTLTO

93-94 consensus (26) TCaffiMAWDMHMNWSPTTTLlIAQTO



SEQ ID NO: Isolate

93 Z7 184 LFLyAGVDA
||| |||||
94 Z6 184 LFLFAGVDA

93-94 consensus (26) LFLfAGVDA
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FIGURE 2G 



SEQ ID NO:
SA7
97
96-101

SEQ ID NO: Isolate

98 SA5 62
100 SA7 62
97 SA4 62
96 SA1 62
99 SA6 62
101 SA13 62
96-101 consensus




SEQ ID NO: Isolate

98 SA5 123
100 SA7 123
97 SA4 123
96 SA1 123
99 SA6 123
101 SA13 123
96-101 consensus





VPYRJ^GVYHVTITOCPNSSIVYRMra^IJiAPGCVTCVkegNVSR 

IIMIIIIIIIMIINMIIIMIIIII IIIIIIIMI I! 1 1 1 ! 1 1 [ 1 1 ] 1 1 1 M I ! 

VFSfRHASGVYHVTimCPHSSrTOEADOTilXHAPGO^CVRQnl^ 

I f 1 1 1 ( M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 ! 1 1 II 1 1 ! 1 1 1 1 III lllllllllllllll 

VFYRHASGVYHVTNDCTNSSIVYRADNLIiaAPGC^ 

III IIMIIIIIMIIIIIIIIIill llllllllllllllllll IIMIIIIIMI | 

VPTlOSD^GVravraDCPNSSIVyBADsiai-HAPGCVPCV^ 

III1IIIIIIIIIIIIIIIMIIII! I.I 1 1 1 1 1 1 1 1 1 1 1 lllllfll llllllll I 

VFYRNASGVYHVTNDCPNSS rVYEADDI^LHAPGCVPCVRkDNVSRCWVhXTPTI*SAPSLG 

II1IIIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIMII lllllll llllllllllf 

VPYWOXSGVYHVTOTC^SSIVYKADDLILHAPGCVP^ 
VPYWD\SG\nfH\nCOTCPNSSIVyKADnIiirjI^ 



AVTAPLRRvVDYLAGGAALCSALYVGDACGAVFL^ IYSGHI 
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FIGURE 2G 
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